oracle rac vip failover test

客户的核心系统跑在一个双节点的RAC上,希望尽量不停止业务,在一个节点做一些硬件维护工作,咨询我停了某个几点是否会有影响,我简单测了一下,把测试的结果告诉客户,详细如下:

以root停掉一个节点的CRS,./crsctl stop crs

查看vip漂移的情况,可以看见别停掉的节点2上的vip漂移到node1上了,这也是rac最基本的功能了,所谓的HA,不就是最小降低对外的影响吗。

[oracle@node1 ~]$ crs_stat -t
Name           Type           Target    State     Host
————————————————————
ora….E1.lsnr application    ONLINE    ONLINE    node1
ora.node1.gsd  application    ONLINE    ONLINE    node1
ora.node1.ons  application    ONLINE    ONLINE    node1
ora.node1.vip  application    ONLINE    ONLINE    node1
ora….E2.lsnr application         ONLINE    OFFLINE
ora.node2.gsd  application    ONLINE    OFFLINE
ora.node2.ons  application    ONLINE    OFFLINE
ora.node2.vip  application    ONLINE    ONLINE    node1
ora.rac.db     application    ONLINE    ONLINE    node1
ora….c1.inst application    ONLINE    ONLINE    node1
ora….c2.inst application    ONLINE    OFFLINE
 

在看看node1上网卡的信息

 

[root@node1 ~]# ifconfig -a|more
eth0      Link encap:Ethernet  HWaddr 08:00:27:CB:AB:70
          inet addr:192.168.201.128  Bcast:192.168.201.255  Mask:255.255.255.0
          inet6 addr: fe80::a00:27ff:fecb:ab70/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:187031975 errors:0 dropped:0 overruns:0 frame:0
          TX packets:175704525 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:59531583315 (55.4 GiB)  TX bytes:54856023944 (51.0 GiB)

eth0:1    Link encap:Ethernet  HWaddr 08:00:27:CB:AB:70
          inet addr:192.168.201.149  Bcast:192.168.201.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1

eth0:2    Link encap:Ethernet  HWaddr 08:00:27:CB:AB:70
          inet addr:192.168.201.150  Bcast:192.168.201.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:
1

eth1      Link encap:Ethernet  HWaddr 08:00:27:07:BC:09
          inet addr:172.10.0.1  Bcast:172.10.255.255  Mask:255.255.0.0
          inet6 addr: fe80::a00:27ff:fe07:bc09/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:16525219 errors:0 dropped:0 overruns:0 frame:0
          TX packets:16380484 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:2112900843 (1.9 GiB)  TX bytes:2416791104 (2.2 GiB)

————————————————————————————————–

测试结果很简单明了,vip的resource、vip都能够切换到健康节点上。而且down机的节点在恢复之后,vip会自动的切换会恢复的节点上

—EOF–

patch apply to solve the startup hang of windows server with cpu more than 64 cores

之前遇到了:拥有很多cpu核数的windows服务器,在dbca建库时hang住的问题,详细的信息可以参考上一篇文章:

http://www.bangela.org/?p=914

下面讲讲其中patch apply的解决方案,当然你也可以disable多余的cpu来解决。

安装10204的软件,然后尝试dbca建库,果然遇到了startup hang的问题,服务器有48个逻辑的core。如下是在试图使用pfile启动库的时候,遭遇的hang住。
 
[Thread-11] [13:15:18:828] [CloneRmanRestoreStep.executeImpl:217]  Instance Creation went fine……….
[Thread-11] [13:15:18:828] [CloneRmanRestoreStep.executeImpl:224]  db_recovery_file_dest=E:oracleproduct10.2.0flash_recovery_area
[Thread-11] [13:15:18:828] [CloneRmanRestoreStep.executeImpl:227]  db_recovery_file_dest_size=2147483648
[Thread-11] [13:15:19:734] [SQLEngine.setSpool:1750]  old Spool  = null
[Thread-11] [13:15:19:734] [SQLEngine.setSpool:1751]  Setting Spool  = E:oracleproduct10.2.0db_1cfgtoollogsdbcatestCloneRmanRestore.log
[Thread-11] [13:15:19:734] [SQLEngine.setSpool:1752]  Is spool appendable? –> true
[Thread-11] [13:15:19:734] [CloneRmanRestoreStep.executeImpl:320]  starting with pfile=E:oracleproduct10.2.0admintestpfileinit.ora
 
patch apply 10008104
 
按照patch介质中的readme指引进行操作。拷贝其要求的Opatch工具,因为readme中对于opatch工具的版本有相应的要求。
 
停止windows服务中的service(oracleServiceSID),如果有的话,然后开始patch apply的过程,严格参考readme就可以了。
 
cd 10008104
opatch apply
 
查看执行期间的log信息
信息:Start the Apply postScript at Tue Nov 27 13:34:01 CST 2012
信息:执行 ' Command and arguments of pre/post scripts is restricted to be viewed in log file ':
 
返回代码 = 0
信息:Finish the Apply postScript at Tue Nov 27 13:34:02 CST 2012
信息:
本地系统已打补丁, 可以重新启动。
 
信息:Finish applying patch to local system at Tue Nov 27 13:34:02 CST 2012
信息:Finishing ApplySession at Tue Nov 27 13:34:02 CST 2012
信息:Total time spent waiting for user-input is 35 seconds.  Finish at Tue Nov 27 13:34:02 CST 2012
 
 
patch apply完毕后,启动windows中的oracle服务(如果没有dbca建库,则不会有oracleServiceSID服务)
 
再次启动dbca,尝试建库,轻松创建成功。
 
这个startup hang的问题还确实挺诡异的,hang住的问题,界面一般没有告警,比较不容易入手诊断。
 
–EOF–
 

Database startup hangs on Windows when machine has more than 64 cores

之前的客户遇到问题,平台是windows server 2008 R2 x64,安装oracle10.2.0.4这个版本,软件的安装顺畅无比,只是在dbca建库的时候hang住了,停在了2%附近,长时间没有反应。我让他仔细看看安装介质zip中的readme文件,里面明确的写着如下的提示信息:

If installing on Windows 7 or Windows Server 2008 R2, then 10.2.0.5 patchset or higher must be used. (My Oracle Support)

于是建议打上10205的补丁,再次dbca建库的时候就ok了。但这里强调的并不是readme的遗漏,有个bug与windows平台相关,这就是在server具备很多个cpu core时,会在启动instance的时候hang住,详细的bug信息如下,需要10204 patch 39才能修复,或则将一些cpu在bios上disable,这个bug主要发生在10204上,而与win2k3或者win2k8不相关。。

 

Bug 9772171  Database startup hangs on Windows when machine has more than 64 cores

 This note gives a brief overview of bug 9772171. 


 The content was last updated on: 17-SEP-2011


 Click here for details of each of the sections below.

 

Affects:

Product (Component) Oracle Server (Rdbms)
Range of versions believed to be affected Versions BELOW 12.1
Versions confirmed as being affected
Platforms affected
  • Windows 64bit (AMD64/EM64T)
  • Windows/NT/XP

Fixed:

This issue is fixed in

Symptoms:

Related To:

Description

On Windows machines with more than 64 logical cores, the Oracle database
(oracle.exe) hangs on startup.

Rediscovery Notes:
Oracle process hangs on startup in a machine with 64+ CPUs.

Workaround
None

 

Please note: The above is a summary description only. Actual symptoms can vary. Matching to any symptoms here does not confirm that you are encountering this problem. For questions about this bug please consult Oracle Support.

 

References

Bug:9772171 (This link will only work for PUBLISHED bugs)
Note:245840.1 Information on the sections in this article

Certification Information for Oracle Database on Microsoft Windows x64 (64-bit)

为客户实施,不太确定oracle 10g在win2k8上的支持情况,去support.oracle.com上找了文档,仅供参考:

 

看来在win2k8上实施oracle10g需要用10204的介质安装,而且最好打补丁到10205,如下

 

In this Document

  Purpose
  Details
  Certification Information for Oracle Database on Microsoft Windows x64
  Platform Information
  Product Information
  Release Information
  Certification Details for Specific Releases

 

Applies to:

Oracle Server – Enterprise Edition – Version 9.2.0.1 to 11.2.0.2 [Release 9.2 to 11.2]


Oracle Server – Standard Edition – Version 9.2.0.1 to 11.2.0.3 [Release 9.2 to 11.2]


Microsoft Windows x64 (64-bit) – OS Version: 7


Microsoft Windows x64 (64-bit)


Microsoft Windows x64 (64-bit) – Version: 2008 R2


Checked for relevance on 15 June 2012

 

Purpose

This document collects into one place all the individual platform and certification notes for Database on Microsoft Windows x64 (64-bit) that were in the older My Oracle Support certification system. This document can be used as a standalone reference and index to important details about the certification of all active Database releases on the Microsoft Windows x64 (64-bit) platform. In addition, all individual certifications will refer to this note so there is a single consistent source of certification details.

Details

Certification Information for Oracle Database on Microsoft Windows x64

This document contains general information pertaining to all releases of the Oracle Database, and information about individual releases

Platform Information

32-bit and 64-bit AMD64/EM64T Information

32-bit Oracle Database Server and Client running on 32-bit Windows is certified on AMD64/EM64T hardware with the exact same Windows versions as on x86 hardware.  However, 32-bit Oracle Database server isnot certified on Windows x64.  32-bit Oracle Database Client is generally certified on the same Windows x64 versions as on 32-bit Windows beginning with the Oracle 10.2 version.  Check the certification matrix for specific 32-bit Oracle Database Client certifications.

Microsoft Windows Small Business Server

Windows Small Business Server is a product suite that bundles several Microsoft products, including the Windows operating system. Oracle Database does not certify with product suites, only operating systems and platforms. To distinguish whether Oracle Database certifies with a particular Windows Small Business Server version, find that suite's operating system version and consult My Oracle Support's certification matrix using that operating system version.

Windows 7 O/S Information:

11gR1 database and client database certification are not planned for Windows 7. If a particular release is certified on Windows 7, it is certified on the following editions:

  • Windows 7 Professional (32-bit and x64)
  • Windows 7 Enterprise (32-bit and x64)
  • Windows 7 Ultimate (32-bit and x64)

Windows 2008 O/S Information:

11gR1 database and client database certification are not planned for Windows 2008 R2. If a release is certified on Windows Server 2008 R2 it will be with the following editions:

  • Windows Server 2008 R2 Standard (x64)
  • Windows Server 2008 R2 Enterprise (x64)
  • Windows Server 2008 R2 Datacenter (x64)
  • Windows Web Server 2008 R2 (x64)
  • Windows Server 2008 R2 Foundation (x64)

If a particular release is certified on Windows Server 2008, it is certified on the following editions:

  • Windows Server 2008 Standard (x86 and x64)
  • Windows Server 2008 Enterprise (x86 and x64)
  • Windows Server 2008 Datacenter (x86 and x64)
  • Windows Web Server 2008 (x86 and x64)
  • Windows Server 2008 Standard without Hyper-V (x86 and x64)
  • Windows Server 2008 Enterprise without Hyper-V (x86 and x64)
  • Windows Server 2008 Datacenter without Hyper-V (x86 and x64)

Oracle Database is not certified with Hyper-V. Oracle Database does not support Server Core.  Hyper-V and Server Core are features in Windows Server 2008 and Windows Server 2008 R2.

Windows Vista O/S Information:

Oracle Database (EE, SE, PE and Client) are supported on these platforms:

  • Business edition
  • Enterprise Edition
  • Ultimate Edition
  • 32-bit Oracle Database 10g Release 2 (10.2.0.3.0) for Microsoft Windows Vista with part numbers B42754-01 and B42755-01, released in 2007, remains certified. Customers may remain on this version and continue to receive Oracle Support.

Windows 2003 O/S Information:

Oracle Database (EE, SE, PE and Client) are supported on these Windows Server 2003 and Windows Server 2003 R2 editions:

  • Standard Edition (32-bit and x64)
  • Enterprise Edition (32-bit and x64)
  • Datacenter Edition (32-bit and x64)
  • Web Edition (32-bit)

Service Pack Information

Oracle certifies against the specific Microsoft operating system (OS) and, if applicable, service packs (SPs) stated in the Oracle product documentation. Oracle will support the use of our products on any later SP or OS security patch as soon as that SP or patch becomes generally available. Depending upon the severity, quantity, and impact of the SP or patch-related issues found, Oracle may recommend that customers wait until relevant Oracle patches have been released before upgrading to a particular SP or OS patch. Oracle may recommend or discourage the installation of specific SPs or patches if it will significantly affect the operation of Oracle software, either positively or negatively. If such a statement is deemed necessary, then Oracle Development will disseminate this statement in as timely a fashion as possible after the release of the SP or patch.

Product Information

For general information relating to certification for the Oracle Database product, including virtualization, interoperability, binary compatibility, general release and patch set information, please see Core Database Certification Information (Doc ID 1306539.1).

Release Information

Patch set release information can be found in Release Schedule of Current Database Releases [ID 742060.1]

Certification Details for Specific Releases

Client interoperability – please review Client / Server / Interoperability Support Between Different Oracle Versions (Doc ID 207303.1)

11.2

11.1

10.2

  • Do not install Oracle Database 10g Release 2 on Windows Server 2008, Server 2008 R2, 7, nor Vista using the 10.2.0.1 version. You must use the 10.2.0.4 (64-bit) x64 version specifically created for these operating systems. After installation, for Windows 7 and Windows Server 2008 R2, patchset 10.2.0.5 or higher must be applied.  For more information on installation see the release notes sections on the installation media:

     

oracle dataguard physical standby test(2)

 

不废话了,直接进入动手配置阶段
 
//primary主库配置部分
3.1.3 configure a standby redo log
online redo logfiles的数量=3个 ,(3+1)*1(thread数量),也就是standby redo log group设置为4组
 
alter database add standby logfile group 4 ('c:beijingstandbyredologstb_redo04.log') size 50M
alter database add standby logfile group 5 ('c:beijingstandbyredologstb_redo05.log') size 50M
alter database add standby logfile group 6 ('c:beijingstandbyredologstb_redo06.log') size 50M
alter database add standby logfile group 7 ('c:beijingstandbyredologstb_redo07.log') size 50M
 
配置主库的参数文件
alter system set log_archive_config='dg_config=(beijing,shanghai)' scope=spfile
alter system set log_archive_dest_1='location=c:arch1beijing valid_for=(all_logfiles,all_roles) db_unique_name=beijing' scope=spfile
alter system set log_archive_dest_2='service=shanghai LGWR ASYNC valid_for=(online_logfiles,primary_role) db_unique_name=shanghai' scope=spfile
//上面两个log_archive_dest_n的参数设置一定要注意空格的问题,否则会遇到启动报错。
 
alter system set log_archive_dest_state_1=enable scope=spfile
alter system set log_archive_dest_state_2=enable scope=spfile
 
//log_archive_dest_state_n在10g上缺省就是enable的
 
alter system set remote_login_passwordfile=exclusive scope=spfile
//remote_passwordlife在10g上缺省就是exclusive的
 
alter system set log_archive_max_processes=30 scope=spfile
 
//主库配置standby role部分的参数:当primary的库切换成为standby角色之后,需要配置这两个参数完成"log的生成源头"与"log的索取发起端"设置。
 
//primary与standby切换之后,原来的beijing就作为standby了,它是log的请求方,而shanghai则作为log的生成方、源头。
 
alter system set fal_server=shanghai scope=spfile
alter system set fal_client=beijing scope=spfile
 
db_file_name_convert和log_file_name_convert我觉得对于primary和standby在两台设备,而且目录结构一样的,不适用,不需要这两个参数。但是,我们这里的demo是会创建两个instance,主库叫beijing,standby库叫shanghai,对应的数据文件、redolog目录肯定有变化(oradata/beijing),所以还要按照如下配置。前面写shanghai是因为,这里指前换后的primary,后面的beijing是指"切换后"的standby,注意领会其中的意思。
 
alter system set db_file_name_convert='shanghai','beijing' scope=spfile
alter system set log_file_name_convert='shanghai','beijing' scope=spfile
 
alter system set standby_file_management=auto scope=spfile
//standby_file_manegement的作用主要是后续在primary创建tablespace、datafile时,能够在standby正常的在相应路径下生成文件(primary、standby路径不一样)
 
———–primary的部分配置完毕了,开始进行standby的配置————-
 
3.2
 
//先备份主库的数据文件
关闭主库,冷拷贝即可
 
//创建standby controlfile
启动主库到mount阶段,创建控制文件
 
alter database create standby controlfile as 'c:standby01.ctl'
 
//为standby数据库准备初始化的参数文件
需要对生成的init文件更改一些部分的设置(最主要的是将'beijing'部分变更为'shanghai',因为要确保路径的正确)
db_unique_name肯定需要修改为shanghai,而不是beijing(db_name不变),因为手工创建的instance name是shanghai。
control_file需要注意相关的目录,将beijing字段修改为shanghai,同时control01.ctl需要更改成在主库上创建的那个standby controlfile名称.
 
//db_file_name_convert需要注意,这里作为standby角色时,明确告诉其应用的redo,生成的文件路径,备库上这个配置一定遵循'primary','standby'的格式。log_file_name_convert与上面db_file_name_convert如出一辙。
db_file_name_convert主要告诉standby库,当primary端新建表空间、数据文件时,新建立的数据文件很可能是$oradata/beijing,而在standby端相应的路径很可能不同
 
例如是:$oradata/shanghai,所有要让新的datafile出现在应该出现的目录中,需要做db_file_name_convert,当然log_file_name_convert是一个道理。
 
总结一个大原则:初始的standby上面不要配置切换后的角色参数,也就是没有主库上的"standby role configuration"
参考配置片段
———————————————————————————————————
*.db_name='beijing'
*.db_recovery_file_dest='C:oracleproduct10.2.0flash_recovery_area'
*.db_recovery_file_dest_size=2147483648
*.dispatchers='(PROTOCOL=TCP) (SERVICE=shanghaiXDB)'
*.fal_client='SHANGHAI'
*.fal_server='BEIJING'
*.job_queue_processes=10
*.log_archive_config='dg_config=(beijing,shanghai)'
*.log_archive_dest_1='location=c:arch1shanghai valid_for=(all_logfiles,all_roles) db_unique_name=shanghai'
*.log_archive_dest_2='service=beijing LGWR ASYNC valid_for=(online_logfiles,primary_role) db_unique_name=beijing'
*.log_archive_max_processes=30
*.log_file_name_convert='beijing','shanghai'
———————————————————————————————————–
 
 
//拷贝主库的数据文件至备库
需要结合standby库上实例名称是shanghai,需要准备一些必要的目录结构,例如:oradata/shanghai,bdump,cdump等等。
 
//在standby上创建instance的服务,在win平台及时设置了auto参数,可能仍然需要手工去service.msc中更改成"自动"
oradim -new -sid shanghai -startmode auto
 
//创建pwdfile,密码和primary的最好一样
orapwd file=C:oracleproduct10.2.0db_1dbsPWDshanghai.ora password=password entries=5
 
//listener.ora与tnsnames.ora
primary与standby的listener和tnsnames.ora都应该是一样的,如果有不一样,也应该仅是port和hostname不一样。
 
//使用构造的initfile构造一个spfile,然后启动standby的库
C:Documents and SettingsAdministrator>sqlplus / as sysdba
 
SQL*Plus: Release 10.2.0.4.0 – Production on 星期六 11月 10 22:18:47 2012
 
Copyright (c) 1982, 2007, Oracle.  All Rights Reserved.
 
已连接到空闲例程。
 
SQL> create spfile from pfile='c:initSHANGHAI.ora';
 
//starup moun  启动到mount阶段
 
————————————————————–
C:Documents and SettingsAdministrator>set ORACLE_SID=shanghai
 
C:Documents and SettingsAdministrator>sqlplus / as sysdba
 
SQL*Plus: Release 10.2.0.4.0 – Production on 星期六 11月 10 22:18:47 2012
 
Copyright (c) 1982, 2007, Oracle.  All Rights Reserved.
 
已连接到空闲例程。
 
SQL> create spfile from pfile='c:initSHANGHAI.ora';
 
文件已创建。
 
SQL> startup mount
ORA-02778: Name given for the log directory is invalid
SQL> create spfile from pfile='c:initSHANGHAI.ora';
 
文件已创建。
 
SQL> startup mount
 
SQL> startup  mount from pfile='c:initSHANGHAI.ora'
ORA-02778: Name given for the log directory is invalid
———————————————————-
 
上面这堆报错是因为有些目录没有创建正确,例如bdump应该在shanghai这个目录下!!!
 
//创建完应有的目录后,再启动到mount阶段就没有报错了
 
SQL> startup mount
ORACLE 例程已经启动。
 
Total System Global Area  209715200 bytes
Fixed Size                  1295872 bytes
Variable Size             113248768 bytes
Database Buffers           92274688 bytes
Redo Buffers                2895872 bytes
数据库装载完毕。
 
 
//和primary一样,添加一些standby redo log group,个数、大小都参照前面primary的操作
//控制文件中含有primary上创建的standby redo log信息,路径不一样,这里需要注意。
//建议drop掉,重建。
 
alter database add standby logfile group 4 ('c:shanghaistandbyredologstb_redo04.log') size 50M
alter database add standby logfile group 5 ('c:shanghaistandbyredologstb_redo05.log') size 50M
alter database add standby logfile group 6 ('c:shanghaistandbyredologstb_redo06.log') size 50M
alter database add standby logfile group 7 ('c:shanghaistandbyredologstb_redo07.log') size 50M
 
———————————————————–
 
//本身从standby拷贝过来的standby.ctl文件中留存了standby redo logfile的信息,需要先drop掉,才能添加。
SQL> select group#,thread#,sequence# from v$standby_log;
 
    GROUP#    THREAD#  SEQUENCE#
———- ———- ———-
         4          0          0
         5          0          0
         6          0          0
         7          0          0
 
SQL> alter database drop standby logfile group 4;
 
数据库已更改。
 
SQL> alter database drop standby logfile group 5;
 
数据库已更改。
 
SQL> alter database drop standby logfile group 6;
 
数据库已更改。
 
SQL> alter database drop standby logfile group 7;
 
数据库已更改。
 
SQL> alter database add standby logfile group 4 ('c:shanghaistandbyredologstb_redo04.log') size 50M;
 
数据库已更改。
 
SQL> alter database add standby logfile group 5 ('c:shanghaistandbyredologstb_redo05.log') size 50M;
 
数据库已更改。
 
SQL> alter database add standby logfile group 6 ('c:shanghaistandbyredologstb_redo06.log') size 50M;
 
数据库已更改。
 
SQL> alter database add standby logfile group 7 ('c:shanghaistandbyredologstb_redo07.log') size 50M;
 
数据库已更改。
 
 
———————————————————–
 
———配置完毕后,开始最重要的redo apply———————-
 
alter database recover managed standby database disconnect from session
 
//standby监听的问题
 
standby的参数文件中少了,db_unique_name一项,可能会造成如下的问题。
——————————-
*.db_name='beijing'
*.db_unique_name='shanghai'
——————————-
 
没有更改之前,standby上的lsnrctl status中的信息是这样的。
 
监听端点概要…
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=standby)(PORT=1521)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(PIPENAME=.pipeEXTPROC0ipc)))
服务摘要..
服务 "PLSExtProc" 包含 1 个例程。
  例程 "PLSExtProc", 状态 UNKNOWN, 包含此服务的 1 个处理程序…
服务 "beijing.world" 包含 1 个例程。
  例程 "shanghai", 状态 READY, 包含此服务的 1 个处理程序…
服务 "shanghai_XPT.world" 包含 1 个例程。
  例程 "shanghai", 状态 READY, 包含此服务的 1 个处理程序…
命令执行成功
 
 
注意:beijing.world…
查询service名称,发现是beijing.world(standby上应该是shanghai.world),肯定是db_unique_name在闹鬼,估计没有设置正确。
 
SQL> show parameter service
 
NAME                                 TYPE        VALUE
———————————— ———– —————–
service_names                        string      beijing.world
 
查询一下primary的v$archive_dest上是否有未生效的参数
 
SQL> select dest_name,status,error from v$archive_dest where error is not null;
 
DEST_NAME                      STATUS    ERROR
—————————— ——— —————————————
LOG_ARCHIVE_DEST_2             ERROR     ORA-12154: TNS:could not resolve the connect identifier specified
 
 
而在primary的trace文件、alert中可能有如下的错误信息:
 
————————————————
Error 12541 connecting to destination LOG_ARCHIVE_DEST_2 standby host 'shanghai'
Error 12541 attaching to destination LOG_ARCHIVE_DEST_2 standby host 'shanghai'
ORA-12541: TNS: 无监听程序
*** 2012-11-10 19:56:32.294 62692 kcrr.c
PING[ARCm]: Heartbeat failed to connect to standby 'shanghai'. Error is 12541.
————————————————-
 
果不其然!在重新给standby生成参数文件,将db_unique_name='shanghai'加进去。(缺省是没有这个参数的)
 
重启standby到mount状态,再查看lsnrctl status状态
—————————————————————————–
监听端点概要…
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=standby)(PORT=1521)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(PIPENAME=.pipeEXTPROC0ipc)))
服务摘要..
服务 "PLSExtProc" 包含 1 个例程。
  例程 "PLSExtProc", 状态 UNKNOWN, 包含此服务的 1 个处理程序…
服务 "shanghai.world" 包含 1 个例程。
  例程 "shanghai", 状态 READY, 包含此服务的 1 个处理程序…
服务 "shanghai_XPT.world" 包含 1 个例程。
  例程 "shanghai", 状态 READY, 包含此服务的 1 个处理程序…
命令执行成功
 
———————————————————————
 
SQL> show parameter service
 
NAME                                 TYPE        VALUE
———————————— ———– ——————
service_names                        string      shanghai.world
 
//重新apply redo
alter database recover managed standby database disconnect from session
 
不断的切换日志,查看主备两边v$archived_log的情况.(sequence#)
 
 
//在primary添加一个数据文件试试
在primary添加了一个数据文件,但是standby并没有自动的生成,我怀疑db_file_name_convert和log_file_name_convert设置不正确。
 
//的确,db_file_name_convert=('path1','path2'),应当参照这个设置,更改之后就能够同步两边的tablespace add和datafile add操作了。
 
 
–EOF–

oracle dataguard physical standby test(1)

 

//主库配置-oracle软件安装
2012-10-29
安装oracle10205 base on win2k3 x64
db_name=orcl
sid=orcl
字符集=zhs16gbk
sys/abcdefg
oracle_base=C:oracleproduct10.2.0
oracle_home=C:oracleproduct10.2.0db_1
 
//备库配置-oracle软件安装
2012-10-29
安装oracle10205 base on win2k3 x64
 
//主库配置-归档、force logging模式
alter database force logging;
 
———变更成归档模式———————————————————————–
startup mount
alter database archivelog
alter system set log_archive_dest_1='LOCATION=e:arch' scope=spfile
—————————————————————————————————–
 
//创建文本型的初始化参数文件给后续的备库使用(后续启动备库使用)
SQL> create pfile='e:initorcl.ora' from spfile
 
//在主库创建备库的控制文件
 
SQL> alter database create standby controlfile as 'e:controlbak.ctl'
 
//配置主库的监听
主库的监听配置文件如下
# listener.ora Network Configuration File: C:oracleproduct10.2.0db_1networkadminlistener.ora
# Generated by Oracle configuration tools.
 
SID_LIST_LISTENER =
  (SID_LIST =
    (SID_DESC =
      (SID_NAME = PLSExtProc)
      (ORACLE_HOME = C:oracleproduct10.2.0db_1)
      (PROGRAM = extproc)
    )
(SID_DESC =
(GLOBAL_DBNAME= orcl)
(ORACLE_HOME = C:oracleproduct10.2.0db_1)
(SID_NAME = orcl)
)
)
 
LISTENER =
  (DESCRIPTION_LIST =
    (DESCRIPTION =
      (ADDRESS = (PROTOCOL = TCP)(HOST = jack)(PORT = 1521))
      (ADDRESS = (PROTOCOL = IPC)(KEY = EXTPROC0))
    )
  )
 
———————————————————————————————————————
 
//配置主库的tnsnames.ora,加入对primary和standby的tns解析名字段
 
配置注意:service name都一样。网络名分别为primary和standby
 
//配置主库传输归档日志到备库的参数
SQL> alter system set log_archive_dest_2='service=standby reopen=30 DB_UNIQUE_NAME=orcl' scope=spfile;
 
关闭主库,冷备相关的文件,涉及的文件:数据文件、pwdfile、构造的init文本参数文件。
 
//拷贝完毕所需的文件后,开始启动主库
 
—–开始进入备库的配置——-
//用oradim工具创建备库的orcl实例并设置服务为自动启动的
C:Documents and SettingsAdministrator>oradim -new -sid orcl -startmode m
Instance created.
 
C:Documents and SettingsAdministrator>oradim -edit -sid orcl -startmode a
通常即使设置了startmode为auto,后续最好去services.msc上查看一些,也许它还是"手工"的,以免不必要的麻烦。
 
//创建密码文件
将从主库拷贝过来的pwdfile拷贝至备库的$ORACLE_BASE/database目录下
 
//在备库主机上创建相应的目录:bdump、cdump、udump、pfile、adump、dpdump、oradataorcl
mkdir C:oracleproduct10.2.0adminorclbdump
mkdir C:oracleproduct10.2.0adminorclcdump
mkdir C:oracleproduct10.2.0adminorcladump
mkdir C:oracleproduct10.2.0adminorclpfile
mkdir C:oracleproduct10.2.0adminorcldpdump
mkdir C:oracleproduct10.2.0adminorcludump
mkdir C:oracleproduct10.2.0oradataorcl
 
//在备库创建监听和配置tnsnames.ora
监听的配置文件内容参考
—————————————————————————————————
# listener.ora Network Configuration File: C:oracleproduct10.2.0db_1networkadminlistener.ora
# Generated by Oracle configuration tools.
 
SID_LIST_LISTENER =
  (SID_LIST =
    (SID_DESC =
      (SID_NAME = PLSExtProc)
      (ORACLE_HOME = C:oracleproduct10.2.0db_1)
      (PROGRAM = extproc)
    )
    (SID_DESC =
      (GLOBAL_NAME = orcl)
      (ORACLE_HOME = C:oracleproduct10.2.0db_1)
      (SID_NAME = orcl)
    )
  )
 
LISTENER =
  (DESCRIPTION_LIST =
    (DESCRIPTION =
      (ADDRESS = (PROTOCOL = TCP)(HOST = peter)(PORT = 1521))
      (ADDRESS = (PROTOCOL = IPC)(KEY = EXTPROC0))
    )
  )
——————————————————————————————————
 
备库的tnsnames.ora中需要配置对于standby与primary的tns解析,过程与主库的一致,不赘述。
 
//配置备库的初始化参数文件
编辑文本的参数文件,添加如下的内容
*.standby_file_management=AUTO
*.remote_archive_enable=TRUE
*.standby_archive_dest='c:arch'
*.fal_server='PRIMARY'
*.fal_client='STANDBY'
注意:在原initorcl.ora文本文件中包含了主库一些配置信息,例如:归档的位置(log_archive_dest_1)需要注意。
 
//配置备库的controlfile
将在主库中生成的standby控制文件复制三份,放到oradata/orcl目录中。
 
 
//启动备库
 
SQL> create spfile from pfile;
 
File created.
 
SQL> startup noumount
SP2-0714: invalid combination of STARTUP options
SQL> startup nomount
ORA-32004: obsolete and/or deprecated parameter(s) specified
ORACLE instance started.
 
Total System Global Area  293601280 bytes
Fixed Size                  2077040 bytes
Variable Size             100667024 bytes
Database Buffers          184549376 bytes
Redo Buffers                6307840 bytes
SQL> select status from v$instance;
 
STATUS
————————
STARTED
 
SQL> alter database mount standby database;
 
Database altered.
 
SQL> alter database recover managed standby database disconnect from session;
 
Database altered.
 
//进行一些基本的测试工作
在主、备库之间测试一下互相的ping、tnsping,查看是否连通
 
测试生产主库的归档日志传输情况
在主库上执行,如下命令,然后在主库的归档路径与备库的归档路径上是否有相应的归档日志文件
SQL> alter system switch logfile;
 
System altered.
 
主库归档路径:log_archive_dest_1
备库的归档路径:standby_archive_dest
在主库的e:arch目录和备库的c:arch目录中都能产生相应的归档日志文件
 
————–主备切换测试——————
在主库的scott用户下创建一个测试表(test),里面是dba_users中的内容
 
//主库中增加如下的参数
alter system set standby_file_management=auto scope=spfile
alter system set remote_archive_enable=true scope=spfile
alter system set standby_archive_dest='e:standby_archive_dest' scope=spfile
alter system set fal_server='standby' scope=spfile
alter system set fal_client='primary' scope=spfile
 
//查看switchover状态
在没有切换主库到备库之前的switchover状态
SQL> select switchover_status from v$database;
 
SWITCHOVER_STATUS
——————–
SESSIONS ACTIVE
 
session active说明有session连接在上面。
 
开始切换主库到备库
SQL> alter database commit to switchover to physical standby;
 
Database altered.
 
//启动到mount和应用日志状态
SQL> shutdown immediate
ORA-01507: database not mounted
 
 
ORACLE instance shut down.
SQL> startup nomount
ORA-32004: obsolete and/or deprecated parameter(s) specified
ORACLE instance started.
 
Total System Global Area  293601280 bytes
Fixed Size                  2077040 bytes
Variable Size             100667024 bytes
Database Buffers          184549376 bytes
Redo Buffers                6307840 bytes
SQL> alter database mount standby database;
 
Database altered.
 
SQL> alter database recover managed standby database disconnect from session;
 
Database altered.
 
 
————–备库操作—————-
 
//在备库中增加如下的参数
SQL> select status from v$instance;
 
STATUS
————
MOUNTED
 
SQL> alter system set log_archive_dest_1='location=c:arch_after' scope=spfile;
 
System altered.
 
alter system set log_archive_dest_2='service=primary valid_for=(online_logfiles,primary_role) db_unique_name=orcl' scope=spfile
SQL> alter system set log_archive_dest_2='service=primary valid_for=(online_logfiles,primary_role) db_unique_name=orcl' scope=spfile;
 
System altered.
 
//停止应用日志,这个操作在主库上执行
alter database recover managed standby database cancel
 
//在备库上执行切换到主库
SQL> alter database commit to switchover to primary;
 
SQL> shutdown immediate
ORA-01109: database not open
 
 
Database dismounted.
ORACLE instance shut down.
SQL> startup
ORA-32004: obsolete and/or deprecated parameter(s) specified
ORACLE instance started.
 
Total System Global Area  293601280 bytes
Fixed Size                  2077040 bytes
Variable Size             100667024 bytes
Database Buffers          184549376 bytes
Redo Buffers                6307840 bytes
Database mounted.
Database opened.
SQL> alter system switch logfile;
 
System altered.
 
SQL> archive log list;
Database log mode               Archive Mode
Automatic archival                  Enabled
Archive destination                 c:arch_after
Oldest online log sequence     9
Next log sequence to archive   10
Current log sequence           10
SQL>
 
//切换之后查看switchover_status(在原来的备库上查询,切换后的主库)
SQL> select switchover_status from v$database;
 
SWITCHOVER_STATUS
——————–
TO STANDBY
 
//在oracle online doc上,这个字段的解释如下
TO STANDBY – This is a primary database and is allowed to switch over to a standby database
 
//切换之后,查看当前主库的角色和状态。
SQL> select database_role,protection_mode from v$database;
 
DATABASE_ROLE    PROTECTION_MODE
—————- ——————–
PRIMARY          MAXIMUM PERFORMANCE
 
//查看切换之后,备注中scott用户下是否有test这个对象,其中的内容是否存在,同时与主库一致。
 
–EOF–