oracle add redolog member

客户的一个数据库需要在现有的redo group中再添加几个member,如下是一个简单的操作记录,以备忘:

SQL> select * from v$log;

GROUP# THREAD# SEQUENCE# BYTES                   MEMBERS           ARC STATUS FIRST_CHANGE#          FIRST_TIME
———- ———-            ———- ———-                  ———                     – — —————- ————- ————
1                1                        4                         52428800            1                          NO       CURRENT      661747                    30-8月 -12
2                1                       2                          52428800            1                          NO       INACTIVE       590225                  28-8月 -12
3                1                       3                          52428800            1                          NO        INACTIVE       617790                  28-8月 -12

可以看到在添加之前,数据库一共有三组redo group,每个group中一个member,大小为50M,如下操作为每个group中再添加一个member

SQL> alter database add logfile member 'C:ORACLEPRODUCT10.2.0ORADATACRDREDO01_2.LOG' to group 1;

数据库已更改。

SQL> alter database add logfile member 'C:ORACLEPRODUCT10.2.0ORADATACRDREDO02_2.LOG' to group 2;

数据库已更改。

SQL> alter database add logfile member 'C:ORACLEPRODUCT10.2.0ORADATACRDREDO03_2.LOG' to group 3;

数据库已更改。

添加完毕的redo member还是处于INVALID 状态,后续可以手工alter database switch logfile来使其online.

 

SQL> select * from v$logfile;

GROUP# STATUS TYPE MEMBER
—   ——-              ——-            ——– ———————————————-
3                        ONLINE       C:ORACLEPRODUCT10.2.0ORADATACRDREDO03.LOG
2                        ONLINE        C:ORACLEPRODUCT10.2.0ORADATACRDREDO02.LOG
1                         ONLINE       C:ORACLEPRODUCT10.2.0ORADATACRDREDO01.LOG
1 INVALID        ONLINE        C:ORACLEPRODUCT10.2.0ORADATACRDREDO01_2.LOG
2 INVALID        ONLINE       C:ORACLEPRODUCT10.2.0ORADATACRDREDO02_2.LOG

3 INVALID         ONLINE        C:ORACLEPRODUCT10.2.0ORADATACRDREDO03_2.LOG

几次alter database switch logfile后,INVALID状态就会消失了。

–EOF–

 

how to query the priviledge of directory

遇到一个疑惑,客户让帮忙看看如何才能查到一个已经被授权read,write某directory权限的用户,其read,write的权限显示在了哪里,于是想都没想给了客户dba_directories视图的建议,但dba_directories视图确实没有想象的需要的东西

SQL> desc dba_directories;
Name Null? Type
—————————————————————– ——– ——————————————–
OWNER                                        NOT NULL                  VARCHAR2(30)
DIRECTORY_NAME                NOT NULL                  VARCHAR2(30)
DIRECTORY_PATH                                                        VARCHAR2(4000)

做个试验。。。follow this…

//create the directory

1* create or replace directory expdir as ‘/home/oracle/expdir’
SQL> /

Directory created.

1* grant read,write on directory expdir to scott
SQL> /

Grant succeeded.

现在scott用户应当对于directory expdir有读、写的权限,这read,write到底纪录在哪里呢?说到底,directory也是oracle中的一个object,那这部分的权限信息纪录在哪里呢?

directory的授权信息都记录在了dba_tab_privs里面,大跌眼镜啊。。directory被oracle当做table纪录在了dba_tab_privs视图中,看来以后真不能“以貌取人”了。。

1* select grantee,owner,grantor,PRIVILEGE from dba_tab_privs where table_name=‘EXPDIR’
SQL> /

GRANTEE                 OWNER              GRANTOR                 PRIVILEGE
—————————— —————————— ——————————
SCOTT                         SYS                          SYS                             READ
SCOTT                         SYS                          SYS                            WRITE

–EOF–

对归档日志的验证失败-rman语言显示的一个”异常”

在客户现场,帮忙查看rman备份的信息,在做crosscheck archivelog all时收到如下的信息,于是咨询客户是否在os上做了删除归档日志的操作,客户反馈没有。

—————————————————–
对归档日志的验证失败
存档日志文件名 =D:ORADATAARCHARC0000001829_0765458436.0001 记录 ID=1805 时间戳 =791704822
对归档日志的验证失败
存档日志文件名 =D:ORADATAARCHARC0000001830_0765458436.0001 记录 ID=1806 时间戳 =791719618
对归档日志的验证失败
存档日志文件名 =D:ORADATAARCHARC0000001831_0765458436.0001 记录 ID=1807 时间戳 =791751623
对归档日志的验证失败
存档日志文件名 =D:ORADATAARCHARC0000001832_0765458436.0001 记录 ID=1808 时间戳 =791771151
对归档日志的验证失败
存档日志文件名 =D:ORADATAARCHARC0000001833_0765458436.0001 记录 ID=1809 时间戳 =791778305
对归档日志的验证失败
存档日志文件名 =D:ORADATAARCHARC0000001834_0765458436.0001 记录 ID=1810 时间戳 =791822474
对归档日志的验证失败
存档日志文件名 =D:ORADATAARCHARC0000001835_0765458436.0001 记录 ID=1811 时间戳 =791848825
对归档日志的验证失败
存档日志文件名 =D:ORADATAARCHARC0000001836_0765458436.0001 记录 ID=1812 时间戳 =791857576
对归档日志的验证失败
存档日志文件名 =D:ORADATAARCHARC0000001837_0765458436.0001 记录 ID=1813 时间戳 =791863234
对归档日志的验证失败
存档日志文件名 =D:ORADATAARCHARC0000001838_0765458436.0001 记录 ID=1814 时间戳 =791906945
对归档日志的验证失败
存档日志文件名 =D:ORADATAARCHARC0000001839_0765458436.0001 记录 ID=1815 时间戳 =791935222
对归档日志的验证失败
存档日志文件名 =D:ORADATAARCHARC0000001840_0765458436.0001 记录 ID=1816 时间戳 =791944004
对归档日志的验证失败
存档日志文件名 =D:ORADATAARCHARC0000001841_0765458436.0001 记录 ID=1817 时间戳 =791944007
对归档日志的验证失败
存档日志文件名 =D:ORADATAARCHARC0000001842_0765458436.0001 记录 ID=1818 时间戳 =791982026
对归档日志的验证失败
存档日志文件名 =D:ORADATAARCHARC0000001843_0765458436.0001 记录 ID=1819 时间戳 =792021613
对归档日志的验证失败
存档日志文件名 =D:ORADATAARCHARC0000001844_0765458436.0001 记录 ID=1820 时间戳 =792021652
对归档日志的验证失败
存档日志文件名 =D:ORADATAARCHARC0000001845_0765458436.0001 记录 ID=1821 时间戳 =792030432
对归档日志的验证失败
存档日志文件名 =D:ORADATAARCHARC0000001846_0765458436.0001 记录 ID=1822 时间戳 =792049858
对归档日志的验证失败
存档日志文件名 =D:ORADATAARCHARC0000001847_0765458436.0001 记录 ID=1823 时间戳 =792093621
对归档日志的验证失败
存档日志文件名 =D:ORADATAARCHARC0000001848_0765458436.0001 记录 ID=1824 时间戳 =792108026
对归档日志的验证失败
存档日志文件名 =D:ORADATAARCHARC0000001849_0765458436.0001 记录 ID=1825 时间戳 =792116856
对归档日志的验证失败
存档日志文件名 =D:ORADATAARCHARC0000001850_0765458436.0001 记录 ID=1826 时间戳 =792122166
对归档日志的验证失败
存档日志文件名 =D:ORADATAARCHARC0000001851_0765458436.0001 记录 ID=1827 时间戳 =792165652
已交叉检验的 75 对象

———————————————–

那还真是奇怪了…..突然脑子里闪出rman语言相关的一个异常显示问题,于是设置环境变量,改成英文显示。

oracle@gongwenServer ~]$ export NLS_LANG=”american_america.zhs16gbk”

字符集不变,设置为american_america,再次crosscheck,你看…….还真是一个“翻译错误”…..

RMAN> crosscheck archivelog all;

using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=1158 devtype=DISK
validation succeeded for archived log
archive log filename=D:ORADATAARCHARC0000001777_0765458436.0001 recid=1753 stamp=790701106
validation succeeded for archived log
archive log filename=D:ORADATAARCHARC0000001778_0765458436.0001 recid=1754 stamp=790725626
validation succeeded for archived log
archive log filename=D:ORADATAARCHARC0000001779_0765458436.0001 recid=1755 stamp=790733192
validation succeeded for archived log
archive log filename=D:ORADATAARCHARC0000001780_0765458436.0001 recid=1756 stamp=790733194
validation succeeded for archived log
archive log filename=D:ORADATAARCHARC0000001781_0765458436.0001 recid=1757 stamp=790772102
validation succeeded for archived log
archive log filename=D:ORADATAARCHARC0000001782_0765458436.0001 recid=1758 stamp=790812015
validation succeeded for archived log
archive log filename=D:ORADATAARCHARC0000001783_0765458436.0001 recid=1759 stamp=790815646
validation succeeded for archived log
archive log filename=D:ORADATAARCHARC0000001784_0765458436.0001 recid=1760 stamp=790819620
validation succeeded for archived log
archive log filename=D:ORADATAARCHARC0000001785_0765458436.0001 recid=1761 stamp=790837252
validation succeeded for archived log
archive log filename=D:ORADATAARCHARC0000001786_0765458436.0001 recid=1762 stamp=790873567
validation succeeded for archived log
archive log filename=D:ORADATAARCHARC0000001787_0765458436.0001 recid=1763 stamp=790898421
validation succeeded for archived log
archive log filename=D:ORADATAARCHARC0000001788_0765458436.0001 recid=1764 stamp=790906044
validation succeeded for archived log
archive log filename=D:ORADATAARCHARC0000001789_0765458436.0001 recid=1765 stamp=790906687
validation succeeded for archived log
archive log filename=D:ORADATAARCHARC0000001790_0765458436.0001 recid=1766 stamp=790945223
validation succeeded for archived log
archive log filename=D:ORADATAARCHARC0000001791_0765458436.0001 recid=1767 stamp=790984816
validation succeeded for archived log
archive log filename=D:ORADATAARCHARC0000001792_0765458436.0001 recid=1768 stamp=790989334
validation succeeded for archived log
archive log filename=D:ORADATAARCHARC0000001793_0765458436.0001 recid=1769 stamp=790993315
validation succeeded for archived log
archive log filename=D:ORADATAARCHARC0000001794_0765458436.0001 recid=1770 stamp=791013622

oracle rac srvctl create service

oracle rac创建service的另外一个方法是使用srvctl工具,使用srvctl工具的好处是无需gui的配合,直接在命令行即可搞定了。

//先看看srvctl add service的使用方法
[oracle@rac1 ~]$ srvctl add service -h
Usage: srvctl add service -d <name> -s <service_name> -r “<preferred_list>” [-a “<available_list>”] [-P <TAF_policy>]
-d <name> Unique name for the database
-s <service> Service name
-r “<pref_list>” List of preferred instances
-a “<avail_list>” List of available instances
-P <TAF_policy> TAF policy (NONE, BASIC, or PRECONNECT)
Usage: srvctl add service -d <name> -s <service_name> -u {-r “<new_pref_inst>” | -a “<new_avail_inst>”}
-d <name> Unique name for the database
-s <service> Service name
-u Add a new instance to service configuration
-r <new_pref_inst> Name of new preferred instance
-a <new_avail_inst> Name of new available instance
-h Print usage
//添加一个service,名字叫service2,rac1作为pre_list成员,rac2作为avail_list成员,TAF policy配置为PRECONNECT
[oracle@rac1 ~]$ srvctl add service -d rac -s service2 -r “rac1” -a “rac2” -P PRECONNECT

//crs_stat 查看一下添加的结果,有两个状态为offline的。
[oracle@rac1 ~]$ crs_stat -t
Name Type Target State Host
————————————————————
ora.rac.db application ONLINE ONLINE rac1
ora….c1.inst application ONLINE ONLINE rac1
ora….c2.inst application ONLINE ONLINE rac2
ora….ice1.cs application ONLINE ONLINE rac1
ora….ac1.srv application ONLINE ONLINE rac1
ora….ice2.cs application OFFLINE OFFLINE
ora….ac1.srv application OFFLINE OFFLINE
ora….C1.lsnr application ONLINE ONLINE rac1
ora.rac1.gsd application ONLINE ONLINE rac1
ora.rac1.ons application ONLINE ONLINE rac1
ora.rac1.vip application ONLINE ONLINE rac1
ora….C2.lsnr application ONLINE ONLINE rac2
ora.rac2.gsd application ONLINE ONLINE rac2
ora.rac2.ons application ONLINE ONLINE rac2
ora.rac2.vip application ONLINE ONLINE rac2
//新增加的两个offline的资源
NAME=ora.rac.service2.cs
TYPE=application
TARGET=OFFLINE
STATE=OFFLINE

NAME=ora.rac.service2.rac1.srv
TYPE=application
TARGET=OFFLINE
STATE=OFFLINE

//service创建完毕后,需要配置这个服务,将其激活,但此操作似乎是多余的,接下来会收到报错
[oracle@rac1 ~]$ srvctl enable service -d rac -s service2
PRKP-1018 : Service service2 already enabled.

//来启动enable完毕的service
[oracle@rac1 ~]$ srvctl start service -d rac -s service2
//刚才offline的两个服务自动被启动了,状态为online
[oracle@rac1 ~]$ crs_stat -t
Name Type Target State Host
————————————————————
ora.rac.db application ONLINE ONLINE rac1
ora….c1.inst application ONLINE ONLINE rac1
ora….c2.inst application ONLINE ONLINE rac2
ora….ice1.cs application ONLINE ONLINE rac1
ora….ac1.srv application ONLINE ONLINE rac1
ora….ice2.cs application ONLINE ONLINE rac1
ora….ac1.srv application ONLINE ONLINE rac1
ora….C1.lsnr application ONLINE ONLINE rac1
ora.rac1.gsd application ONLINE ONLINE rac1
ora.rac1.ons application ONLINE ONLINE rac1
ora.rac1.vip application ONLINE ONLINE rac1
ora….C2.lsnr application ONLINE ONLINE rac2
ora.rac2.gsd application ONLINE ONLINE rac2
ora.rac2.ons application ONLINE ONLINE rac2
ora.rac2.vip application ONLINE ONLINE rac2
//使用srvctl config来查看service的状态,可以看到新创建的service2的一些信息,这里TAF选择的是PRECONNECT。
[oracle@rac1 ~]$ srvctl config service -d rac -a
service1 PREF: rac1 AVAIL: rac2 TAF: basic
service2 PREF: rac1 AVAIL: rac2 TAF: PRECONNECT
//srvctl工具创建的服务能够自动添加到spfile中,show parameter已经能够查到其踪迹了
SQL> show parameter service

NAME TYPE VALUE
———————————— ———– ——————————
service_names          string           service1,   rac.world,    service2
//service2的添加已经反应到了spfile参数文件中(不知道如果是pfile会如何)

//查看一下在数据字典中添加了什么内容?
1* select name,FAILOVER_METHOD,FAILOVER_TYPE,FAILOVER_RETRIES,FAILOVER_DELAY from dba_services
SQL> /

NAME FAILOVER_METHOD FAILOVER_TYPE FAILOVER_RETRIES FAILOVER_DELAY
—————————— —————————— —————————— —————- ————–
SYS$BACKGROUND
SYS$USERS
seeddataXDB
seeddata
racXDB
rac.world
service1                                        BASIC                                 SELECT                     170                           4
service2
service2_PRECONNECT

//可以看到在数据字典中,添加了两行与service2相关的内容,name字段中有service2、service2_PRECONNECT
//其中service2在创建过程中,TAF的method选择的是PRECONNECT,这里与之前的basic不同。
//与basic不同的是,可见在dba_services中,name字段多了两行,service2与service_PRECONNECT

//但是在TAF policy的4个选项(method,type,retry,delay)也同样没有配置上,需要借助dbms包来解决。

//使用dbms_service包来配置TAF的四个策略选项,method配置为PRECONNECT,type配置为session,retries=150,delay设置为5
//shit…正当我在模仿如下的脚本准备相应的脚本时,无意间查了一些dbms_service中关于method如何设置为PRECONNECT时,还自己意淫相应的方法为failover_method_preconnect是发现,如下一段文字:

//
Server side TAF method is BASIC. BASIC is the only value currently supported.
This means that a new connection is established at failure time.
It is not possible to pre-establish a backup connection. (which is to say, PRECONNECT is not supported)
//


begin
dbms_service.modify_service(
service_name=>’service2′,
failover_method=>dbms_service.failover_method_basic,
failover_type=>dbms_service.failover_type_session,
failover_retries=>160,
failover_delay=>5,
);
end;
/

//我日….悲催啊。。。”basic is the only value current supported…..”

//得,srvctl更改service为的method为basic,重新来
[oracle@rac1 ~]$ srvctl modify service -d rac -s service2 -h
Usage: srvctl modify service -d <name> -s <service_name> -i <old_inst_name> -t <new_inst_name> [-f]
-d <name> Unique name for the database
-s <service> Service name
-i <old_inst> Old instance name
-t <new_inst> New instance name
-f Disconnect all sessions during stop or relocate service operations
Usage: srvctl modify service -d <name> -s <service_name> -i <avail_inst_name> -r [-f]
-d <name> Unique name for the database
-s <service> Service name
-i <inst> Instance name
-r Upgrade instance to preferred
-f Disconnect all sessions during stop or relocate service operations
Usage: srvctl modify service -d <name> -s <service_name> -n -i <prefered_inst> [-a <available_list>] [-f]
-d <name> Unique name for the database
-s <service> Service name
-n Modify service configuration
-r “<pref_list>” List of preferred instances
-a “<avail_list>” List of available instances
-f Disconnect all sessions during stop or relocate service operations
Usage: srvctl modify service -d <name> -s <service_name> -P <TAF_policy> [-f]
-d <name> Unique name for the database
-s <service> Service name
-P <TAF_policy> TAF policy (NONE, BASIC, or PRECONNECT)
-f Disconnect all sessions during stop or relocate service operations
-h Print usage

//执行更改
[oracle@rac1 ~]$ srvctl modify service -d rac -s service2 -P BASIC
//查看状态
[oracle@rac1 ~]$ srvctl config service -d rac -a
service1 PREF: rac1 AVAIL: rac2 TAF: basic
service2 PREF: rac1 AVAIL: rac2 TAF: BASIC
//crs stack中注册的service相关的两个resource都依旧是online的。
[oracle@rac1 ~]$ crs_stat -t
Name Type Target State Host
————————————————————
ora.rac.db application ONLINE ONLINE rac1
ora….c1.inst application ONLINE ONLINE rac1
ora….c2.inst application ONLINE ONLINE rac2
ora….ice1.cs application ONLINE ONLINE rac1
ora….ac1.srv application ONLINE ONLINE rac1
ora….ice2.cs application ONLINE ONLINE rac1
ora….ac1.srv application ONLINE ONLINE rac1
ora….C1.lsnr application ONLINE ONLINE rac1
ora.rac1.gsd application ONLINE ONLINE rac1
ora.rac1.ons application ONLINE ONLINE rac1
ora.rac1.vip application ONLINE ONLINE rac1
ora….C2.lsnr application ONLINE ONLINE rac2
ora.rac2.gsd application ONLINE ONLINE rac2
ora.rac2.ons application ONLINE ONLINE rac2
ora.rac2.vip application ONLINE ONLINE rac2

//再看看spfile中的参数,spfile中的记录并没有什么变化。
SQL> show parameter service

NAME TYPE VALUE
———————————— ———–     ——————————
service_names                string               service1,   rac.world,    service2

//看看dba_services中,数据字典中的记录。
SQL> l
1* select name,FAILOVER_METHOD,FAILOVER_TYPE,FAILOVER_RETRIES,FAILOVER_DELAY from dba_services
SQL> /

NAME FAILOVER_METHOD FAILOVER_TYPE FAILOVER_RETRIES FAILOVER_DELAY
—————————— —————————— —————————— —————- ————–
SYS$BACKGROUND
SYS$USERS
seeddataXDB
seeddata
racXDB
rac.world
service1 BASIC SELECT 170 4
service2
service2_PRECONNECT
//dba_services中似乎没有什么反应,依旧存在一个service2_PRECONNECT的东西,再看看lsnrctl中的记录信息
LSNRCTL> set current_listener listener_rac1
Current Listener is listener_rac1
LSNRCTL> status
Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.201.149)(PORT=1521)(IP=FIRST)))
STATUS of the LISTENER
————————
Alias LISTENER_RAC1
Version TNSLSNR for Linux: Version 10.2.0.5.0 – Production
Start Date 20-AUG-2012 18:13:34
Uptime 3 days 19 hr. 11 min. 7 sec
Trace Level off
Security ON: Local OS Authentication
SNMP OFF
Listener Parameter File /opt/app/oracle/product/10.2.0/db_1/network/admin/listener.ora
Listener Log File /opt/app/oracle/product/10.2.0/db_1/network/log/listener_rac1.log
Listening Endpoints Summary…
(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.201.149)(PORT=1521)))
(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.201.128)(PORT=1521)))
(DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=EXTPROC)))
Services Summary…
Service “rac.world” has 2 instance(s).
Instance “rac1”, status READY, has 2 handler(s) for this service…
Instance “rac2”, status READY, has 1 handler(s) for this service…
Service “racXDB.world” has 2 instance(s).
Instance “rac1”, status READY, has 1 handler(s) for this service…
Instance “rac2”, status READY, has 1 handler(s) for this service…
Service “rac_XPT.world” has 2 instance(s).
Instance “rac1”, status READY, has 2 handler(s) for this service…
Instance “rac2”, status READY, has 1 handler(s) for this service…
Service “service1.world” has 1 instance(s).
Instance “rac1”, status READY, has 2 handler(s) for this service…
Service “service2.world” has 1 instance(s).
Instance “rac1”, status READY, has 2 handler(s) for this service…
Service “service2_PRECONNECT.world” has 1 instance(s).
Instance “rac2”, status READY, has 1 handler(s) for this service…
The command completed successfully
//监听中还是记录了一些service2相关的信息,但是service2以及service2_PRECONNECT多少让人有些疑惑,不知道是modify service之前的状态还是之后的状态。

//即使reload listener之后,还是如下的状态(lsnrctl services查看)
Service “service2.world” has 1 instance(s).
Instance “rac1”, status READY, has 2 handler(s) for this service…
Handler(s):
“DEDICATED” established:0 refused:0 state:ready
REMOTE SERVER
(ADDRESS=(PROTOCOL=TCP)(HOST=rac1-vip.localdomain)(PORT=1521))
“DEDICATED” established:0 refused:0 state:ready
LOCAL SERVER
Service “service2_PRECONNECT.world” has 1 instance(s).
Instance “rac2”, status READY, has 1 handler(s) for this service…
Handler(s):
“DEDICATED” established:0 refused:0 state:ready
REMOTE SERVER
(ADDRESS=(PROTOCOL=TCP)(HOST=rac2-vip.localdomain)(PORT=1521))
The command completed successfully

//也罢,继续使用dbms包完成剩下在数据字典中的更改工作。method=basic,type=session,retry=150,delay=5

begin
dbms_service.modify_service(
service_name=>’service2′,
failover_method=>dbms_service.failover_method_basic,
failover_type=>dbms_service.failover_type_session,
failover_retries=>160,
failover_delay=>5
);
end;
/

//查看dba_services中的状态
SQL> select name,FAILOVER_METHOD,FAILOVER_TYPE,FAILOVER_RETRIES,FAILOVER_DELAY from dba_services;

NAME FAILOVER_METHOD FAILOVER_TYPE FAILOVER_RETRIES FAILOVER_DELAY
—————————— —————————— —————————— —————- ————–
SYS$BACKGROUND
SYS$USERS
seeddataXDB
seeddata
racXDB
rac.world
service1                     BASIC SELECT                                    170                        4
service2                  BASIC SESSION                                    160                          5
service2_PRECONNECT

–EOF—

dbca create service in rac

oracle rac环境中,service可以通过DBCA,srvctl来添加,这篇先简单介绍dbca创建service。

在GUI中,执行dbca会出现图形的窗口,如下:

 

//dbca创建完毕service之后,查看service的状态,数据库的数据字典中已经有了反应
SQL> show parameter service

NAME TYPE VALUE
———————————— ———– ——————————
service_names                            string                  service1

//再看看listener中是否有”反应”,service1.world就是这个反应

LSNRCTL> set current_listener listener_rac1
Current Listener is listener_rac1
LSNRCTL> services
Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.201.149)(PORT=1521)(IP=FIRST)))
Services Summary…
Service “rac.world” has 2 instance(s).
Instance “rac1”, status READY, has 2 handler(s) for this service…
Handler(s):
“DEDICATED” established:0 refused:0 state:ready
REMOTE SERVER
(ADDRESS=(PROTOCOL=TCP)(HOST=rac1-vip.localdomain)(PORT=1521))
“DEDICATED” established:0 refused:0 state:ready
LOCAL SERVER
Instance “rac2”, status READY, has 1 handler(s) for this service…
Handler(s):
“DEDICATED” established:0 refused:0 state:ready
REMOTE SERVER
(ADDRESS=(PROTOCOL=TCP)(HOST=rac2-vip.localdomain)(PORT=1521))
Service “racXDB.world” has 2 instance(s).
Instance “rac1”, status READY, has 1 handler(s) for this service…
Handler(s):
“D000” established:0 refused:0 current:0 max:1022 state:ready
DISPATCHER <machine: rac1, pid: 19622>
(ADDRESS=(PROTOCOL=tcp)(HOST=rac1.localdomain)(PORT=32827))
Instance “rac2”, status READY, has 1 handler(s) for this service…
Handler(s):
“D000” established:0 refused:0 current:0 max:1022 state:ready
DISPATCHER <machine: rac2, pid: 18599>
(ADDRESS=(PROTOCOL=tcp)(HOST=rac2.localdomain)(PORT=32811))
Service “rac_XPT.world” has 2 instance(s).
Instance “rac1”, status READY, has 2 handler(s) for this service…
Handler(s):
“DEDICATED” established:0 refused:0 state:ready
REMOTE SERVER
(ADDRESS=(PROTOCOL=TCP)(HOST=rac1-vip.localdomain)(PORT=1521))
“DEDICATED” established:0 refused:0 state:ready
LOCAL SERVER
Instance “rac2”, status READY, has 1 handler(s) for this service…
Handler(s):
“DEDICATED” established:0 refused:0 state:ready
REMOTE SERVER
(ADDRESS=(PROTOCOL=TCP)(HOST=rac2-vip.localdomain)(PORT=1521))
Service “service1.world” has 1 instance(s).
Instance “rac1”, status READY, has 2 handler(s) for this service…
Handler(s):
“DEDICATED” established:0 refused:0 state:ready
REMOTE SERVER
(ADDRESS=(PROTOCOL=TCP)(HOST=rac1-vip.localdomain)(PORT=1521))
“DEDICATED” established:0 refused:0 state:ready
LOCAL SERVER
The command completed successfully

//再看看CRS中是否已经注册service的信息?
crs_stat |more
有如下的片段,说明新创建的service已经注册到了ocr中
NAME=ora.rac.service1.rac1.srv
TYPE=application
TARGET=ONLINE
STATE=ONLINE on rac1

//使用srvctl看看ocr中的信息,查看service的信息,首选的节点是rac1,备选节点是rac2,TAF策略中的method是basic,
//但是没有显示type,retries,delay这三个TAF策略

[oracle@rac1 ~]$ srvctl config service -d rac -a
service1 PREF: rac1 AVAIL: rac2 TAF: basic

//再看看dba_services中记录的东西
SQL> select name,FAILOVER_METHOD,FAILOVER_TYPE,FAILOVER_RETRIES,FAILOVER_DELAY from dba_services;

NAME                  FAILOVER_METHOD                      FAILOVER_TYPE             FAILOVER_RETRIES      FAILOVER_DELAY
—————————— —————————————- —————————————- —————- ————–
SYS$BACKGROUND
SYS$USERS
seeddataXDB
seeddata
racXDB
rac.world
service1

//dba_services中关于TAF的几个策略都没”反应”上,看来还需要dbms包来配置如下策略

(FAILOVER_METHOD,FAILOVER_TYPE,FAILOVER_RETRIES,FAILOVER_DELAY)

//使用dbms_service包来添加TAF的策略

SQL> begin
2 dbms_service.modify_service(
3 service_name=>’service1′,
4 failover_method=>dbms_service.failover_method_basic,
5 failover_type=>dbms_service.failover_type_select,
6 failover_retries=>170,
7 failover_delay=>4
8 );
9 end;
10 /

PL/SQL procedure successfully completed.

//再次查看dba_services中的TAF策略信息,retry、delay、FAILOVER_TYPE、FAILOVER_METHOD
SQL>  select name,FAILOVER_METHOD,FAILOVER_TYPE,FAILOVER_RETRIES,FAILOVER_DELAY from dba_services

NAME                               FAILOVER_METHOD                   FAILOVER_TYPE            FAILOVER_RETRIES FAILOVER_DELAY
—————————— ——————————     ——————————      —————-        ————–
SYS$BACKGROUND
SYS$USERS
seeddataXDB
seeddata
racXDB
rac.world
service1                BASIC                                          SELECT                                         170                                     4

 

–EOF–

ocr restore test

//rm the ocr file

[root@rac1 config]# rm -rf  ocr

[root@rac1 bin]# ./ocrcheck
PROT-602: Failed to retrieve data from the cluster registry

[root@rac1 bin]# ./crsctl stop crs
OCR initialization failed accessing OCR device: PROC-26: Error while accessing the physical storage Operating System error [No such file or directory] [2]

//the ocr is gone and the crs status is incorrect

//create a ocr file,as I use NFS as share storage,so just touch a ocr file

[root@rac1 bin]# touch /opt/app/config/ocr
//change the owner

chown root:oinstall /opt/app/config/ocr
use the physical backup to restore ocr

[root@rac1 bin]# ./ocrconfig -restore ../cdata/crs/backup00.ocr
start the crs
[root@rac1 bin]# ./crsctl start crs

//check the crs stack
[root@rac1 bin]# ./crsctl check crs
CSS appears healthy
CRS appears healthy
EVM appears healthy

//maybe, u need toe crs_stat -all to pull up all the resource

 

[oracle@rac2 ~]$ crs_stat -t
Name Type Target State Host
————————————————————
ora.rac.db application ONLINE ONLINE rac1
ora….c1.inst application ONLINE ONLINE rac1
ora….c2.inst application ONLINE ONLINE rac2
ora….C1.lsnr application ONLINE ONLINE rac1
ora.rac1.gsd application ONLINE ONLINE rac1
ora.rac1.ons application ONLINE ONLINE rac1
ora.rac1.vip application ONLINE ONLINE rac1
ora….C2.lsnr application ONLINE ONLINE rac2
ora.rac2.gsd application ONLINE ONLINE rac2
ora.rac2.ons application ONLINE ONLINE rac2
ora.rac2.vip application ONLINE ONLINE rac2

 

–EOF–

how to backup vote disk and ocr

vote disk could be backup by dd,you can following this:

check the information msg of vote disk

[oracle@rac1 ~]$ crsctl query css votedisk
0. 0 /opt/app/config/votingdisk

located 1 votedisk(s).

the vote disk is located in /opt/app/config/votingdisk

using dd to backup the vote disk:

[oracle@rac1 ~]$ dd if=/opt/app/config/votingdisk of=votediskBak
20000+0 records in
20000+0 records out

[oracle@rac1 ~]$ ll vot*
-rw-r–r– 1 oracle oinstall 10240000 Aug 20 14:38 votediskBak

it’s quite easy.

How about the OCR? following below:

//show the backup information of OCR

[oracle@rac1 ~]$ ocrconfig -showbackup

rac2 2012/08/20 14:39:02 /opt/app/crs/product/10.2.0/crs/cdata/crs

rac2 2012/08/20 10:38:58 /opt/app/crs/product/10.2.0/crs/cdata/crs

rac2 2012/08/20 06:38:55 /opt/app/crs/product/10.2.0/crs/cdata/crs

rac2 2012/08/18 18:38:16 /opt/app/crs/product/10.2.0/crs/cdata/crs

rac2 2012/08/07 06:05:05 /opt/app/crs/product/10.2.0/crs/cdata/crs

//you can see the backup history information

[oracle@rac1 crs]$ ls -lrt
total 29428
-rw-r–r– 1 root root 4292608 Aug 7 06:12 week.ocr
-rw-r–r– 1 root root 4292608 Aug 14 07:44 week_.ocr
-rw-r–r– 1 root root 4292608 Aug 18 18:46 day.ocr
-rw-r–r– 1 root root 4292608 Aug 19 18:46 day_.ocr
-rw-r–r– 1 root root 4292608 Aug 20 06:46 backup02.ocr
-rw-r–r– 1 root root 4292608 Aug 20 10:46 backup01.ocr
-rw-r–r– 1 root root 4292608 Aug 20 2012 backup00.ocr

// the ocr was backup automaticlly every four hours as the basic backup policy which is showing above

//you need to use root to export the ocr config message as following

[root@rac1 bin]# ./ocrconfig -export /home/oracle/ocrBak.exp

if the ocr was corrupted,you can import the config of backup files into the ocr.

export and import is the logical backup and restore method,ocrconfig provide the “restore” command,but the “restore” command can only restore from the “physical backup” which is the backup made by the automatic policy.

-EOF-

 

the group and owner infor of OCR,VOTEING DISK

oracle clusterware安装过程中,需要ocr、voting disk,其宿主、权限一直存在误区,例如:660,640,oracle:oinstall等等。

今天在看”2 Day + Real Application Clusters Guide”才发现如下的截图:

2个ocr应该是root:oinstall的宿主设置,mode值应当是640。3个voting disk宿主是oracle:oinstall,而mode值也是640。

sdg,sdh,sdi都是给ASM instance使用,宿主为通常理解的oracle:oinstall,mode值为660.

细节2:

在文档中,有关于/etc/oratab的描述,

Modify the /etc/oratab file on each node in the cluster, adding in the
appropriate instance information.

Note: In a single-instance database, setti ng the last field of each entry
to N disables the automatic startup of a database when the server it
runs on is restarted. For an Oracle RAC database, these fields are set
to N because Oracle Clusterware starts the instances and processes,
not the dbstart utility.

从note的地方也可以看出,在rac环境下,instance的启动是通过crs来搞定的,而不是单实例下的dbstart。

这是rac与单实例的差别。

–EOF–

ora-610(2)

continue focus on ORA-610 errors issue, I got some clue from MOS, following as:

Cause

This is usually a resource problem. i.e. lack of resources at the OS level.

//beside, here are some workarounds you can take a look,but if the resource on os is going to exhausting, they are out of work.so , the key points is to enlarge the resource on OS.

Solution

You can try the following as solutions :

1. Increase the paramater JOB_QUEUE_PROCESSES.

2. Make sure Swap is correctly configured and increase it if necessary.

4 ~ 8 GBs of RAM, set Swap to at least double RAM size.

8~16 GBs of RAM, set Swap to double RAM.

16~32 GBs of RAM or more, set Swap at a rate of 0.75xRAM.

3. Make sure all the Kernel parameters are configured correctly and above minimum requirements, especially the maximum number of processes per non-root user.

4. Use the HCVE to check that all the minimum requirements are met.

Note 250262.1 – RDA 4 – Health Check / Validation Engine Guide

5. Check the ulimits and preferably set all to Unlimited.

6. Check the system log at the same time of the messages for errors/any indication of resource problem.

7. Check your SGA and PGA settings, make sure they are not set too low or too high. Preferably

PGA_AGGREGATE_TARGET is set at a rate of 2 MBs per process defined by the PROCESSES

parameter. An over allocation of resources could also cause the same.
One of the above solutions will most likely fix your problem but if none does, please log an SR with Oracle Support to assist you.

Hints:

You can monitor the PROCESSES usage at the database level using the V$RESOURCE_LIMIT view.

Also, if you monitor the database and see a lot of INACTIVE sessions, then you need to resolve that first before addressing the suggestions above, these sessions are lying idle eating up resources, and causing limitations, so the users need to make sure to exit their sessions gracefully, these sessions should be cleaned up regularly and perhaps consider using DCD and KeepAlive (Windows only).

Note 226202.1 – How to Set Keepalive on Windows NT/2000
Note 151972.1 – Dead Connection Detection (DCD) Explained

ORA-00610: Internal error code

一个客户的数据库遭遇报错,暂时到没有影响到客户端程序,发来了日志信息,让帮忙查看。
Tue Aug 14 14:22:20 2012
Errors in file c:oracleproduct10.2.0adminaaabdumpaaa_ora_3592.trc:
ORA-00610: Internal error code
Tue Aug 14 14:22:25 2012
Process J000 died, see its trace file
Tue Aug 14 14:22:25 2012
kkjcre1p: unable to spawn jobq slave process 
Tue Aug 14 14:22:25 2012
Errors in file c:oracleproduct10.2.0adminaaabdumpaaa_cjq0_3976.trc:
ORA-00600的internal报错经常见,610倒是第一次见到。在mos上大概搜了一下,发现与OS当时的资源消耗相关度比较大,问了客户当时的OS负载、操作等,当时os正经历IO负载较大的操作(大数据量的文件拷贝),资源相对紧张,但io的消耗增大也会对oracle产生影响吗?比较疑惑。
MOS上倒是有篇doc与症状很类似,如下:
ORA-610 INTERNAL ERROR CODE, KKJCRE1P: UNABLE TO SPAWN JOBQ [ID 436509.1]

Applies to:

Oracle Server – Enterprise Edition – Version: 11.1.0.6 and later   [Release: 11.1 and later ]
Information in this document applies to any platform.

Symptoms

When running a job from the scheduler on a very heavily loaded database you could receive an ora-610 error.  The job slave process creation takes a long time and raises a reported error as a fatal error in the background process.

The resulting trace file will have this information listed in it:

New process timed out
error 610 in job queue process
ORA-00610: Internal error code
Died during process startup with error 447 (seq=109)
OPIRIP: Uncaught error 447. Error stack:
ORA-00447: fatal error in background process

Cause

This is caused by lack of resources on the server.  If the system is extremely loaded, the timeout could be hit and the spawning error will occur. You will see this problem where a job slave process is not started is if the system is running out of resources (e.g. no memory left, max number of processes hit, etc.)

 

Solution

Ensure that there are enough resources on the server to satisfy all the scheduler/job queue requests.

症状非常类似,只是apply to :11.1.0.6让人疑惑,客户的环境是oracle10201
–EOF–

issue of yum config in rhel 5u5

rhel 5u5上打rpm包是非常烦人的事情,没有license,无法连接redhat的yum源。只好寻求centos的源帮忙。

进入/etc/yum.repos.d目录,里面有一个缺省的源rhel-debuginfo.repo,删掉、重建吧,里面的内容帮助不大。

[molin@localhost downloads]$ cd /etc/yum.repos.d

[molin@localhost downloads]$ vi rhel-debuginfo.repo

文件的内容如下:

[base]
name=CentOS-5 – Base

baseurl=http://ftp.sjtu.edu.cn/centos/5/os/$basearch/
gpgcheck=1
gpgkey=http://mirror.centos.org/centos/RPM-GPG-KEY-centos5

这个时候如果你直接yum install rpm包的话,会遭遇报错:

warning: rpmts_HdrFromFdno: Header V3 DSA signature: NOKEY, key………….

还需要重新import key:

rpm –import http://centos.ustc.edu.cn/centos/RPM-GPG-KEY-CentOS-5

 

[root@hundsun ~]# yum install libaio-devel
Loaded plugins: rhnplugin, security
This system is not registered with RHN.
RHN support will be disabled.
Setting up Install Process
Resolving Dependencies
–> Running transaction check
—> Package libaio-devel.i386 0:0.3.106-5 set to be updated
—> Package libaio-devel.x86_64 0:0.3.106-5 set to be updated
–> Finished Dependency Resolution

Dependencies Resolved

================================================================================
Package Arch Version Repository Size
================================================================================
Installing:
libaio-devel i386 0.3.106-5 base 12 k
libaio-devel x86_64 0.3.106-5 base 11 k

Transaction Summary
================================================================================
Install 2 Package(s)
Upgrade 0 Package(s)

Total download size: 23 k
Is this ok [y/N]: y
Downloading Packages:
(1/2): libaio-devel-0.3.106-5.x86_64.rpm | 11 kB 00:00
(2/2): libaio-devel-0.3.106-5.i386.rpm | 12 kB 00:00
——————————————————————————–
Total 6.8 kB/s | 23 kB 00:03
Running rpm_check_debug
Running Transaction Test
Finished Transaction Test
Transaction Test Succeeded
Running Transaction
Installing : libaio-devel 1/2
Installing : libaio-devel 2/2

Installed:
libaio-devel.i386 0:0.3.106-5 libaio-devel.x86_64 0:0.3.106-5

Complete!

现在可以正常的yum了,当然上述的配置都是建立在这样的前提条件的:”能够ping通外网”

 

 

reading”RAC and Clusterware Best Practice and Start Kit on windows”

最近在读如下这篇note时,才发现及时在windows平台上,oracle rac 需要注意、配置的地方还是很多的。参考如下这篇”best practice”
To BottomTo Bottom

Jul 13, 2012BULLETINPUBLISHED1
There are no commentsComments (0) Rate this document Email link to this document Open document in new window Printable Page

In this Document

Purpose
Scope
Details
RAC and Oracle Clusterware Best Practices and Starter Kit (Platform Independent)
RAC Platform Specific Starter Kits and Best Practices
RAC on Windows Step by Step Installation Instructions
RAC on Windows Best Practices
OS Configuration Considerations
Network Considerations
Storage Considerations
Hardware/Vendor Specific Considerations
Oracle Software Considerations
Installation
General
References

 

Applies to:

Oracle Server – Enterprise Edition – Version 10.2.0.1 to 11.2.0.1.0 [Release 10.2 to 11.2]
Microsoft Windows x64 (64-bit)
Microsoft Windows (32-bit)
Microsoft Windows Itanium (64-bit)

Purpose

The goal of the Oracle Real Application Clusters (RAC) series of Best Practice and Starter Kit notes is to provide customers with quick knowledge transfer of generic and platform specific best practices for implementing, upgrading and maintaining an Oracle RAC system. This document is compiled and maintained based on Oracle’s experience with its global RAC customer base.

This Starter Kit is not meant to replace or supplant the Oracle Documentation set, but rather, it is meant as a supplement to the same. It is imperative that the Oracle Documentation be read, understood, and referenced to provide answers to any questions that may not be clearly addressed by this Starter Kit.

All recommendations should be carefully reviewed by your own operations group and should only be implemented if the potential gain as measured against the associated risk warrants implementation. Risk assessments can only be made with a detailed knowledge of the system, application, and business environment.

As every customer environment is unique, the success of any Oracle Database implementation, including implementations of Oracle RAC, is predicated on a successful test environment. It is thus imperative that any recommendations from this Starter Kit are thoroughly tested and validated using a testing environment that is a replica of the target production environment before being implemented in the production environment to ensure that there is no negative impact associated with the recommendations that are made.

Scope

//适用的范围:不仅仅是RAC的安装,在后续的升级同样适用

This article applies to all new and existing RAC implementations as well as RAC upgrades.

Details

RAC and Oracle Clusterware Best Practices and Starter Kit (Platform Independent)

The following document focuses on RAC and Oracle Clusterware Best Practices that are applicable to all platforms including a white paper on available RAC System Load Testing Tools and RAC System Test Plan outlines for 10gR2 & 11gR1 and 11gR2:

Document 810394.1 RAC and Oracle Clusterware Best Practices and Starter Kit (Platform Independent)

 

RAC Platform Specific Starter Kits and Best Practices

The following notes contain detailed platform specific best practices including Step-By-Step installation cookbooks (downloadable in PDF format):

Document 811306.1 RAC and Oracle Clusterware Best Practices and Starter Kit (Linux)
Document 811280.1 RAC and Oracle Clusterware Best Practices and Starter Kit (Solaris)
Document 811271.1 RAC and Oracle Clusterware Best Practices and Starter Kit (Windows)
Document 811293.1 RAC and Oracle Clusterware Best Practices and Starter Kit (AIX)
Document 811303.1 RAC and Oracle Clusterware Best Practices and Starter Kit (HP-UX)

 

RAC on Windows Step by Step Installation Instructions

Click here for a Step By Step guide for installing Oracle RAC 10gR2 on Windows (2003 and 2008)
Click here for a Step By Step guide for installing Oracle RAC 11gR1 on Windows (2003 and 2008)
Click here for a Step By Step guide for installing Oracle RAC 11gR2 on Windows (2003 and 2008)

RAC on Windows Best Practices

The Best Practices in this section are specific to the Windows Platform. That said, it is essential that the Platform Independent Best Practices found in Document 810394.1 be reviewed in addition to the content provided in this Document.

OS Configuration Considerations

//杀毒软件是需要注意的一个地方,也许你费劲巴拉debug半天,才发现oracle自己的某个文件被放到了隔离区中,那是很郁闷的一件事情

  • Disable Anti-Virus software running on cluster nodes before and for the entire duration of the installation of Oracle on Windows (be mindful of manual reboots during this time).  Anti-Virus software may of course be re-enabled following the installation but the following should be excluded from being scanned:
    • Oracle Software directories
    • OCFS filesystems
    • ACFS filesystems
    • Network scanning of the private interconnect
  • Disable the Windows firewall.  If the Windows firewall must be re-enabled, it MUST NOT be configured for the private network.
  • //防火墙和磁盘的加密,都会给你带来阻碍。
  • The use of Disk Encryption software on RAC servers is highly discouraged and is to be avoided as Disk encryption software has been known to cause problems during multi-node installation and patching.
  • For Windows 2008 systems modify the elevation prompt behavior for administrators to “Elevate without prompting”:
    1. Open a command prompt and type 'secpol.msc' to launch the Security Policy Console management utility.
    2. From the Local Security Settings console tree, click Local Policies, and then Security Options
    3. Scroll down to and double-click User Account Control: Behavior of the elevation prompt for administrators.
    4. From the drop-down menu, select: "Elevate without prompting (tasks requesting elevation will automatically run as elevated without prompting the administrator)"
    5. Click OK to confirm the changes
  • Ensure that the Administrators group has the ability to manage auditing and security logs:
    1. Open a command prompt and type 'secpol.msc' to launch the Security Policy Console management utility.
    2. Click on 'Local Policies'
    3. Click on 'User Rights Assignment'
    4. Locate and double click the 'Manage auditing and security log' in the listing of User Rights Assignments.
    5. If the Administrators group is NOT listed in the 'Local Security Settings' tab, add the group now.
    6. Click OK to save the changes (if changes were made)
  • Set /USEPMTIMER in the boot.ini to prevent excessive LMD and LMS trace generation and to prevent connectivity issues as described in Document 437101.1.
  • //desktop heap也是头一次听说,和很多memory exhaustion相关。
  • Increase the size of the default Non-Interactive Desktop Heap to 1MB to prevent instability due to Desktop Heap exhaustion.  Information on how to increase this value can be found in Document 744125.1 and Microsoft Knowledge Base Article KB947246.   It is advised that you consult with Microsoft for further tuning of the Non-Interactive Desktop Heap beyond 1MB.
  • //而且官方的文档建议:最好在cmd窗口、会话级别来设置ORACLE_HOME,而不是整个系统的环境变量中设置。
  • Do not set ORACLE_HOME as an environment variable in Oracle (RAC on) Windows environments. If needed (for example, when running opatch) set this variable as needed in a command prompt window.  Reference Document 969581.1.  In particular, this can cause undesired behavior (for example:  listeners starting under the wrong ORACLE_HOME) during upgrade.
  • //在系统的高级设置中,需要将best performance设置为”program”
  • Windows 2000 and 2003 systems should be optimized for Memory Usage of Programs not System Caching (not an option in 2008):
    Start -> Settings -> Control Panel -> System -> Advanced -> Performance -> Memory Usage: Adjust for best performance of -> Programs instead of System Caching
  • Run Perfmon to monitor CPU, Memory, Network, Disk IO Rates – To aid in troubleshooting, configure Perfmon to monitor these OS statistics and to generate binary log files (.BLG). Instructions for implementing this change can be found on the Microsoft support website using the following link: http://support.microsoft.com/kb/146005
  • Download and install Debugging Tools for Windows (containing, among others, adplus and windbg) on each node of your RAC on Windows cluster.  These tools can be an invaluable resource when troubleshooting complex issues.  The downloads and instructions for implementation can be found on the Download and Install Debugging Tools for Windows MSDN website.
  • Download and familiarize the DBA team with useful Sysinternals Windows utilities such as Process Explorer.  These utilities are available on the Microsoft Sysinternals Website.
  • Keep memory allocation under 80%. We recommend shooting for 75% allocated, that is, more than 20-25% free. This will allow for ample memory needed for Windows OS operations (including collection of physical memory dumps if required).
  • There is a general requirement for Oracle RAC to synchronize the time on all nodes.  If the Windows Time Service is being used, modify the Windows Time service settings to avoid large jumps in time and allow the time to gradually match with the reference time. Restart the Windows Time service after you complete this task.
1. Open a command prompt (as the Admin user) and type 'regedit'.
2. Within the registry editor locate the HKEY_LOCAL_MACHINESYSTEMCurrentControlSetServicesW32TimeConfig key.
3. Set the value for MaxPosPhaseCorrection = 600
4. Set the value for MaxNegPhaseCorrection = 600
5. Set the value for MaxAllowedPhaseOffset = 600
Note that with these values set, if there is a time discrepancy between a cluster node and the reference node that is greater than 10 minutes, time will not be adjusted and a message will be logged (to the Windows Event Viewer)*  Consider also setting the ‘updateinterval’ parameter.  Reference:  http://technet.microsoft.com/en-us/library/cc773263%28v=ws.10%29.aspx
** Consider also setting ‘Maximum Tolerance for Computer Clock Synchronization’ (Reference: http://technet.microsoft.com/en-us/library/cc779260%28v=WS.10%29.aspx). Note thatKerberos Authentication will failif the time difference between the nodes is greater than the ‘Maximum Tolerance for Computer Clock Synchronization’.

Note:  With 11gR2, Cluster Time Synchronization Daemon (CTSSD) can be used in place of Windows Time Service. CTSSD will synchronize time with a reference node in the cluster when Windows Time Service is not found to be configured. Should you require synchronization from an external time source you must use Windows Time Service which will cause CTSSD to run in “observer” mode. However, if Windows Time Service is running, then it must to be configured as shown above.

Network Considerations

  • The Public network MUST be listed first in the network interface binding order.  To make this change perform the following:
    1. Click Start, click Run, type 'ncpa.cpl', and then click OK.
    2. In the menu bar on the top of the window click 'Advanced' and choose 'Advanced Settings' (For Windows 2008, if the "Advanced" is not showing, click 'Alt' to enable that menu item).
    3. Under the Adapters and Bindings tab use the up arrow to move the Public interface to the top of the Connections list.
    4. Under Binding order for increase the priority of IPv4 over IPv6
    5. Click OK to save the changes
  • DHCP Media Sense MUST be disabled. This change must be manually implemented for Windows 2000 but is disabled by default in 2003. Additional information (including instructions for disabling) for Windows 2000 and Windows 2003 can be found in MS Knowledge Base Article KB239924. For Windows 2008, this feature is once again enabled. To disable DHCP Media Sense on 2008, execute the following from a command window as the Administrator user:
    C:UsersAdministrator> netsh interface ipv4 set global dhcpmediasense=disabled
    C:UsersAdministrator> netsh interface ipv6 set global dhcpmediasense=disabled
    Validate the change with:C:UsersAdministrator> netsh interface ipv4 show global
    C:UsersAdministrator> netsh interface ipv6 show global

    //”SNP feature捣乱的问题暂时没有遇到,但是隐约觉得客户那边很多不确定、不容易重现的问题会和这个设置有关”

  • After installing Windows Server 2003 Service Pack 2 (SP2) or Windows Server 2003 Scalable Networking Pack (SNP), turn off default SNP features. On a computer that has a TCP/IP Offload-enabled network adapter, you may experience many network-related problems like network adapters consuming lots of nonpaged pool memory or adapters requesting large blocks of contiguous memory causing the computer to stop responding when it tries to free the memory. This problem also affects Windows 2008 operating systems.  See My Oracle Support Document 988008.1 and Microsoft Knowledge Base Articles KB948496 andKB951037 for details around this issue and instructions on how to take corrective action.  Essentially, you can apply this recommendation by issuing the following commands:
C:UsersAdministrator> netsh int tcp set global chimney=disabled
C:UsersAdministrator> netsh int tcp set global rss=disabled
Validate these changes with:C:UsersAdministrator> netsh interface ipv4 show global
C:UsersAdministrator> netsh interface ipv6 show global

  • Do not use the names:  PUBLIC and PRIVATE (all caps) for your public and interconnect networks (NICs) due to unpublished Bug 6844099.  The words public and private themselves may be used, for example:  Public and Private are acceptable.
  • Network interface names in ‘Network Connections’ (under Control PanelAll Control Panel ItemsNetwork Connections) must match ‘name’ as indicated by ‘Ethernet adapter <name>’ in the ipconfig /all output.
  • Note that there is currently no plan to enable the (11.2) Redundant Interconnect Usage feature and HAIPs on Windows.

Storage Considerations

//杀毒软件还是非常凶险的东西,参考如下标红的部分

  • It is strongly advised to bring the entire Oracle software stack down in order to complete all disk virus scans for conventional Fat 16/32/NTFFS as well as OCFS file systems. This is because Oracle and the virus scan software use different types of locking which are not compatible.  Hence a shared disk for the database configured with OCFS could have 2 nodes virus scanning at the same time and could potentially cause the cluster to crash. We strongly suggest that you virus scan only from one node and only during maintenance windows.  OCFS disks that only contain Oracle database datafiles do not need to be virus scanned.  OCFS disks that contain any non-database datafiles or database configuration files should be scanned periodically (with the entire Oracle stack down).
  • Desupport of the Oracle Cluster File System (OCFS) on Windows is final with Oracle Database 12.  Customers currently using OCFS on Windows to host either the Oracle cluster files (Oracle Cluster Registry – OCR – and Voting Files) or database files or both will need to migrate these files off OCFS prior to upgrading to Oracle Database 12.  See My Oracle Support Document 1392280.1 for more details.

Hardware/Vendor Specific Considerations

  • Ensure minimum BIOS version 2.35.3.3 is used for SUN V40Z DUAL CORE machines, for ECC memory checking.
  • Ensure SUN V40Z 2.6V memory management voltage regulator issues. A SUN CE can identify if the voltage regulator is beginning to fail. The new VRM (Voltage Regulator Module) revision board from rev 1.0 to rev 2.0.

Oracle Software Considerations

The Software Considerations in this section are specific to the Windows Platform. That said, it is highly recommended that the Platform Independent Best Practices found in Document 810394.1 be reviewed in addition to the content below.

Installation

  • Prevent installation failures by disabling the Windows Firewall prior to installation of Oracle, this applies to all Oracle versions.  See the OS Configuration Considerations section within this note for details.
  • Prevent installation failures by disabling Anti-Virus software prior to installation of Oracle, this applies to all Oracle versions. See the OS Configuration Considerations section within this note for details.
  • Prevent installation failures by stopping ‘Distributed Transaction Coordinator’ and ‘Windows Management Instrumentation’ (WMI) services (on each node) prior to installation or patching.  Note that in some cases it may be required to actually disable WMI to allow patching.
  • Prepare for resolution of potential locked file issues by downloading ‘Process Explorer’ from Microsoft’s sysinternals website prior to installation or patching.

General

  • Be sure that the latest windows bundle patch has been applied to ensure optimal performance and stability of the system.  This applies to ALL Oracle releases.  The latest available patch bundles can be found inDocument 161549.1.
  • Note that as a normal function of our Oracle Clusterware / Grid Infrastructure, OraFenceService is designed to fence (I/O) and reboot a node if it perceives that that node is ‘hung’ once its configured timeout has been reached. The default timeout for the OraFence driver is a (very low) 5 seconds. What this means is that if the OraFence driver detects what it perceives to be a hang at the operating system level and that hang persists beyond 5 seconds, it’s possible that the OraFence driver – of its own accord – will fence and evict the node.  It is advisable in some cases to increase the OraFence timeout value as high as 10 seconds in some cases.  The OraFence timeout is controlled by the following Windows registry key:  HKEY_LOCAL_MACHINESYSTEMCurrentControlSetServicesOraFenceServiceTimeout

Database – RAC/Scalability Community
To discuss this topic further with Oracle experts and industry peers, we encourage you to review, join or start a discussion in the My Oracle Support Database – RAC/Scalability Community

References

NOTE:1392280.1 – Desupport of Oracle Cluster File System (OCFS) on Windows with Oracle Database 12
NOTE:437101.1 – EXCESSIVE LMS AND LMD TRACE FILE SIZES GENERATED ON WINDOWS RAC
NOTE:744125.1 – Connections Fail with ORA-12640 or ORA-21561
NOTE:810394.1 – RAC and Oracle Clusterware Best Practices and Starter Kit (Platform Independent)
NOTE:988008.1 – RAC on Windows: Recurring Node Evictions May Be Caused by Default SNP Features Available for Windows Server 2003 SP2 and 2008
NOTE:811280.1 – RAC and Oracle Clusterware Best Practices and Starter Kit (Solaris)
NOTE:811306.1 – RAC and Oracle Clusterware Best Practices and Starter Kit (Linux)
NOTE:969581.1 – How to Set or Switch Oracle Homes on Windows
NOTE:811271.1 – RAC and Oracle Clusterware Best Practices and Starter Kit (Windows)
NOTE:811293.1 – RAC and Oracle Clusterware Best Practices and Starter Kit (AIX)
NOTE:811303.1 – RAC and Oracle Clusterware Best Practices and Starter Kit (HP-UX)
BUG:6844099 – OUI RETURNS WRONG VALUE WHEN NETWORKINTERFACE IS CALLED PUBLIC, PRIVATE, UNKNOWN

oracle rac spfile change attention

the customer will meet this kind of situation to change the path of spfile in a rac environment, such as you need to change the old disk array or something else,just pay attention when execute the implementation.

you know there are something more to consider about when you operate in a rac environment comparing with single instance.It’s better to use “srvctl” tool to do operations,as many configuration was recorded in OCR, the sqlplus will take no effect on it,u need to use srvctl tools.

for example:

you want toe change the path of spfile for each nodes,it’s better to do following behind:

//to show the path registered in OCR

[oracle@rac1 ~]$ srvctl config database -d rac -a
rac1 rac1 /opt/app/oracle/product/10.2.0/db_1
rac2 rac2 /opt/app/oracle/product/10.2.0/db_1
DB_UNIQUE_NAME: rac
DB_NAME: null
ORACLE_HOME: /opt/app/oracle/product/10.2.0/db_1
SPFILE: /opt/app/oradata/rac/spfile.dbf
DOMAIN: world
DB_ROLE: null
START_OPTIONS: null
POLICY: AUTOMATIC
ENABLE FLAG: DB ENABLED

——————————————————————-

do confirm the correctness of the path in ocr and the exact configuration in your envrionments.

or use following cmd to make a change

[oracle@rac1 ~]$ srvctl modify database -d rac -p ‘$pathOfSpfile’

 

if you don’t……the node can not be bring up normally and u will receive  errors 

CRSAPP][2592017296]0StartResource error for ora.ora10.ora101.inst error code = 1
[  CRSRES][2592017296]0Start of `ora.ora10.ora101.inst` on member `node1` failed.
[  CRSRES][2592017296]0node2 : CRS-1019: Resource ora.ora10.ora101.inst (application) cannot run on node2

[  CRSRES][2623486864]0Start of `ora.ora10.ora102.inst` on member `node2` failed.
[  CRSRES][2623486864]0node1 : CRS-1019: Resource ora.ora10.ora102.inst (application) cannot run on node1

[  CRSAPP][2633976720]0StartResource error for ora.ora10.db error code = 1
[  CRSRES][2633976720]0Start of `ora.ora10.db` on member `node1` failed.
[  CRSRES][2633976720]0Attempting to start `ora.ora10.db` on member `node2`
[  CRSRES][2675936144]0startRunnable: setting CLI values