Bad header found during buffer read

一个客户的数据库遭遇突然的服务停止,基本的信息为”oracle 9201+windows server 2003 32bit”。客户反馈,在某天晚上,开发人员发现连接不上数据库了,于是登上服务器
发现,OracleServiceSid的服务异常停止了。。。。查看oracle的alert.log发现如下信息。

Current log# 3 seq# 114865 mem# 0: D:ORACLEORADATASINITEKREDO03.LOG
***
Corrupt block relative dba: 0x03c1f9d9 (file 15, block 129497)
Bad header found during buffer read
Data in bad block –
type: 6 format: 2 rdba: 0x03c1f9e0
last change scn: 0x0000.07474aa1 seq: 0x1 flg: 0x06
consistency value in tail: 0x4aa10601
check value in block header: 0xdff2, computed block checksum: 0x0
spare1: 0x0, spare2: 0x0, spare3: 0x0
***

最初怀疑是corrupt block的问题。于是根据file_id=15,block 129497来定位损坏的block位置。
SQL> select file_id,file_name from dba_data_files;
FILE_ID FILE_NAME
———- ———————————————
15 D:ORACLEORADATAXXXXXXCSJJ.ORA

使用dbv工具来检查此数据文件坏块的问题。
C:Documents and SettingsAdministrator>dbv blocksize=8192 file=D:ORACLEORADATAXXXXXCSJJ.ORA

DBVERIFY: Release 9.2.0.1.0 – Production on 星期一 12月 26 09:54:37 2011

Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved.

DBVERIFY – 验证正在开始 : FILE = D:ORACLEORADATAXXXXXXCSJJ.ORA

DBVERIFY – 验证完成

检查的页总数 :332800
处理的页总数(数据):131350
失败的页总数(数据):0
处理的页总数(索引):95869
失败的页总数(索引):0
处理的页总数(其它):3429
处理的总页数 (段) : 0
失败的总页数 (段) : 0
空的页总数 :102152
标记为损坏的总页数:0
汇入的页总数 :0

擦。。。竟然没查出来坏块,下面来定位坏块的位置。

select owner,segment_name,tablespace_name from dba_extents where file_id=15 and 129497 between block_id and block_id+blocks-1
owner segment_name tablespace_name
—— ————- —————-
GSXX FNDXXXX CSXX

对FNDXXXX做查询,看看是否报错,分别”走索引”、”不走索引”

select /*+ no_index(GSXX.FNDXX GSXX.PK_FNDXX) */ count(*) from GSXX.FNDXX
返回结果正常

再尝试走索引的查询
select count(*) from GSXX.FNDXX
结果集返回也正常。。

擦。。。。有点没头绪了。alert中有”corrupt”的信息,但是dbv、正常的查询都没有遭遇报错,而且db在手工重启OracleServiceSID之后,运行正常,暂时没有遭遇报错,客户问我怎么回事。。我只能说:先走着瞧吧。。

–EOF–

There have been too many unsuccessful login attempts; please see the system administrator

用户打来电话,在AIX上登录账户的时候遭遇报错,无法登录。
There have been too many unsuccessful login attempts; please see the system administrator
询问了客户,得知之前做过安全加固,对于登录次数、失败的次数做过严格限制,导致在多次账户密码输入错误后,登录遭遇报错。
以root用户查询/etc/security/lastlog文件

xjapan:

time_last_login = 1249055415

tty_last_login = /dev/pts/15

host_last_login = 58.22.58.88

unsuccessful_login_count = 20

time_last_unsuccessful_login = 1249057164

tty_last_unsuccessful_login = ftp

host_last_unsuccessful_login = 58.22.58.110

“unsuccessful_login_count”表示登录失败的次数,应该是尝试了20次登录没成功,而且最后一次不成功来自ftp客户端。

安全加固是双刃剑,如果在small business尤其就两个技术人员的地方,实在没必要自找麻烦。

-EOF-

amazon RDS for oracle

amazon真是个伟大的公司!这不,在微博上看到了release amazon RDS on oracle的消息。慢慢打开了封闭企业的大门,要知道在金融行业中,oracle的占有率很高,在传统的oltp类型的应用中,能与oracle竞争的对手真不多。但是我好奇企业对于私密数据的安全性是否存有担忧?尤其在国内。

Amazon RDS for Oracle Database

Amazon RDS makes it easy to set up, operate, and scale Oracle Database deployments in the cloud. With Amazon RDS, you can deploy multiple editions of Oracle Database 11g in minutes with cost-efficient and re-sizable hardware capacity. Amazon RDS frees you up to focus on application development by managing time-consuming database administration tasks including provisioning, backups, software patching, monitoring, and hardware scaling.

You can run Amazon RDS for Oracle under two different licensing models – “License Included” and “Bring-Your-Own-License (BYOL)”. In the “License Included” service model, you do not need separately purchased Oracle licenses; the Oracle Database software has been licensed by AWS. “License Included” pricing starts at $0.16 per hour, inclusive of software, underlying hardware resources, and Amazon RDS management capabilities. If you already own Oracle Database licenses, you can use the “BYOL” model to run Oracle deployments on Amazon RDS, with rates starting at $0.11 per hour. The “BYOL” model is designed for customers who prefer to use existing Oracle database licenses or purchase new licenses directly from Oracle.

You can take advantage of hourly pricing with no upfront fees or long-term commitments. In addition, you also have the option to purchase Reserved DB Instances under one or three year reservation terms. With Reserved DB Instances, you can make low, one-time, upfront payment for each DB Instance and then pay a significantly discounted hourly usage rate, achieving up to 48% net cost savings.

详细可以参考aws的官方网站http://aws.amazon.com/rds/oracle/

aix上使用xmanager启动图形的界面

在给客户实施时经常会遇到使用aix上的图形界面的情况,例如安装oracle的时候如果对slient方式不是很熟悉,可以考虑dbca图形创建数据库实例。而redhat linux下面使用比较多的是vncserver启动gnome或者kde,在aix下面,使用较多的是xmanager去连接aix server的图形环境:例如CDE(前提是安装了图形界面)。在aix端比较方便,无需什么配置,只需要支持XDMCP协议即可,而aix上面CDE的xdmcp服务器为dtlogin,监听的端口一般为177。netstat -an|grep 177查询一下即可。
安装完Xmanager后,在Xbrowser中New一个XDMCP,选择XDM query,在Host输入框输入主机名或IP 地址,选Apply,然后Done,启动Xmanager就会显示图形登录介面了。

简单在图形界面中开启的terminal中输入如下:

su – oracle//su到oracle用户下面
设置display
trade_test:/local/oracle#export DISPLAY=192.168.201.124:0.0
trade_test:/local/oracle#dbca//图形启动了..

备注:
export DISPLAY=YourIPAddress:x.0

YourIPAddress是你运行Xmanager 的机器的IP地址。x是Xmanger那个窗口上提示的数字。一般是0。
然后就可以启动OUI安装了。

open virtualization format

ovf是各个虚拟机厂家之间的一个交互标准,例如vmware workstation–>esx—>virtualbox等等。如果你想让vmware workstation中的虚拟机能够正确在esx或者ovf中run起来,可以参考使用ovftool将虚拟机文件转换成ovf格式的。
ovftool.exe “$path/source.vmx” “$path/target.ovf”
然后可以将通用的ovf导入目标端。

oracle recycle bin

在10g后,oracle出了个叫recycle bin的东西,粗浅的理解成”回收站”就可以。
表空间的Recycle Bin 区域只是一个逻辑区域,而不是从表空间上物理的划出一块区域固定用于回收站,因此Recycle Bin是和普通对象共用表空间的存储区域,或者说是Recycle Bin的对象要和普通对象抢夺存储空间。当发生空间不够时,Oracle会按照先入先出的顺序覆盖Recycle Bin中的对象。所以为了避免这类资源的消耗,建议管理员不定期的看看recyle bin中的对象,定时清理。
也可以手动的删除Recycle Bin占用的空间:
1). Purge tablespace tablespace_name : 用于清空表空间的Recycle Bin
2). Purge tablespace tablespace_name user user_name: 清空指定表空间的Recycle Bin中指定用户的对象
3). Purge recyclebin: 删除当前用户的Recycle Bin中的对象
4). Purge dba_recyclebin: 删除所有用户的Recycle Bin中的对象,该命令要sysdba权限
5). Drop table table_name purge: 删除对象并且不放在Recycle Bin中,即永久的删除,不能用Flashback恢复。
6). Purge index recycle_bin_object_name: 当想释放Recycle bin的空间,又想能恢复表时,可以通过释放该对象的index所占用的空间来缓解空间压力。 因为索引是可以重建的

Fine Grained Auditing (FGA) is independant to oracle standard audit

在oracle-base上找到一篇介绍FGA的文章,取了片段引用过来。原文可以参考http://www.oracle-base.com/articles/10g/Auditing_10gR2.php
FGA与oracle standard audit是相对独立的,具体可以参考引用的例子,长久以来错误的以为oracle standard audit是FGA的基础,惭愧啊。
Fine Grained Auditing (FGA)
Fine grained auditing extends Oracle standard auditing capabilities by allowing the user to audit actions based on user-defined predicates. It is independant of the AUDIT_TRAIL parameter setting and all audit records are stored in the FGA_LOG$ table, rather than the AUD$ table. The following example illustrates how fine grained auditing is used.

First, create a test table.
CONN audit_test/password CREATE TABLE emp ( empno NUMBER(4) NOT NULL, ename VARCHAR2(10), job VARCHAR2(9), mgr NUMBER(4), hiredate DATE, sal NUMBER(7,2), comm NUMBER(7,2), deptno NUMBER(2) ); INSERT INTO emp (empno, ename, sal) VALUES (9999, ‘Tim’, 1); INSERT INTO emp (empno, ename, sal) VALUES (9999, ‘Larry’, 50001); COMMIT;
The following policy audits any queries of salaries greater than £50,000.
CONN sys/password AS sysdba BEGIN DBMS_FGA.add_policy( object_schema => ‘AUDIT_TEST’, object_name => ‘EMP’, policy_name => ‘SALARY_CHK_AUDIT’, audit_condition => ‘SAL > 50000’, audit_column => ‘SAL’); END; /
Querying both employees proves the auditing policy works as expected.
CONN audit_test/password SELECT sal FROM emp WHERE ename = ‘Tim’; SELECT sal FROM emp WHERE ename = ‘Larry’; CONN sys/password AS SYSDBA SELECT sql_text FROM dba_fga_audit_trail; SQL_TEXT —————————————— SELECT sal FROM emp WHERE ename = ‘Larry’ 1 row selected. SQL>
Extra processing can be associated with an FGA event by defining a database procedure and associating this to the audit event. The following example assumes the FIRE_CLERK procedure has been defined:
BEGIN DBMS_FGA.add_policy( object_schema => ‘AUDIT_TEST’, object_name => ‘EMP’, policy_name => ‘SALARY_CHK_AUDIT’, audit_condition => ‘SAL > 50000’, audit_column => ‘SAL’, handler_schema => ‘AUDIT_TEST’, handler_module => ‘FIRE_CLERK’, enable => TRUE); END; /
The DBMS_FGA package contains the following procedures:
ADD_POLICY
DROP_POLICY
ENABLE_POLICY
DISABLE_POLICY
In Oracle9i fine grained auditing was limited queries, but in Oracle 10g it has been extended to include DML statements, as shown by the following example.
— Clear down the audit trail. CONN sys/password AS SYSDBA TRUNCATE TABLE fga_log$; SELECT sql_text FROM dba_fga_audit_trail; no rows selected. — Apply the policy to the SAL column of the EMP table. BEGIN DBMS_FGA.add_policy( object_schema => ‘AUDIT_TEST’, object_name => ‘EMP’, policy_name => ‘SAL_AUDIT’, audit_condition => NULL, — Equivalent to TRUE audit_column => ‘SAL’, statement_types => ‘SELECT,INSERT,UPDATE,DELETE’); END; / — Test the auditing. CONN audit_test/password SELECT * FROM emp WHERE empno = 9998; INSERT INTO emp (empno, ename, sal) VALUES (9998, ‘Bill’, 1); UPDATE emp SET sal = 10 WHERE empno = 9998; DELETE emp WHERE empno = 9998; ROLLBACK; — Check the audit trail. CONN sys/password AS SYSDBA SELECT sql_text FROM dba_fga_audit_trail; SQL_TEXT ————————————– SELECT * FROM emp WHERE empno = 9998 INSERT INTO emp (empno, ename, sal) VALUES (9998, ‘Bill’, 1) UPDATE emp SET sal = 10 WHERE empno = 9998 DELETE emp WHERE empno = 9998 4 rows selected. — Drop the policy. CONN sys/password AS SYSDBA BEGIN DBMS_FGA.drop_policy( object_schema => ‘AUDIT_TEST’, object_name => ‘EMP’, policy_name => ‘SAL_AUDIT’); END; /

ORA-04030 out of memory

客户现场的数据库报错了,一个b/s结构的web系统,在夜间跑一个batch任务的时候报错退出,job失败一般reload一下就行了,对前台的客户倒是影响也不大。
ORA-04030: out of process memory when trying to allocate 24 bytes (top call heap,PGA heap pool)
skgpspawn failed:category = 27143, depinfo = 9261, op = spdcr, loc = skgpspawn
skgpspawn failed:category = 27143, depinfo = 9261, op = spdcr, loc = skgpspawn
skgpspawn failed:category = 27143, depinfo = 9261, op = spdcr, loc = skgpspawn
skgpspawn failed:category = 27143, depinfo = 9261, op = spdcr, loc = skgpspawn
skgpspawn failed:category = 27143, depinfo = 9261, op = spdcr, loc = skgpspawn
skgpspawn failed:category = 27143, depinfo = 9261, op = spdcr, loc = skgpspawn

win32 4G ram,oracle 9201,530M的pga,sga_max_size给了1.5G。当然了,9i下面,sga的使用是在各个pool中定义的。sga_max_size是db_cache,share_pool等等能够调节之后,使用的最大值。一般来说,db_cache_size+share pool之后会小于这个值。遇到这个报错多半是os上的资源不足够了。win32bit下面,user process能够使用的最大内存<2G,这里的user process是除了系统进程以外的所有程序进程,可以理解为:除了系统以外,其余非系统进程所能使用的。当然,oracle.exe肯定是其中之一而已。有人说开启os上的PAE,/3G设置能够解决此问题。我觉得仅仅能缓解而已。如果pga确实满足不了app,os上还是会陆续的报错,类似"unable to begin xxx thread"之类的。解决方法?建议更换成64bit的吧。或者缩小pga..试试。

CRS-0184: Cannot communicate with the CRS daemon

环境rhel 4u8+oracle rac+nfs
当你想在oracle用户下面通过crs_start -all来启动所有资源的时候,可能会遇到如下类似的情形。
[oracle@rac2 dbs]$ crs_stat -t
CRS-0184: Cannot communicate with the CRS daemon.
这个时候需要去查看一下当前节点上的crs进程是否正常。
[root@rac2 bin]# ./crsctl check crs
Failure 1 contacting CSS daemon
Cannot communicate with CRS
Cannot communicate with EVM
[root@rac2 bin]# ps -ef|grep crs
root 10816 1 0 09:59 ? 00:00:00 /bin/sh /etc/init.d/init.crsd run
root 21639 30579 0 10:32 pts/1 00:00:00 grep crs

可见,这不是正常启动状态下crs的输出信息。
CRS正常情况下的状态输出可以参考如下:
[root@rac1 bin]# ./crsctl check crs
CSS appears healthy

lanmanservershare

windows的共享权限都记录在注册表里面,“lanmanservershare”。如果用户遇到迁移共享文件夹的需求时,可以考虑注册表的导入导出。文件夹目录的ntfs权限应该在copy时能带过去,而共享权限都在注册表导出文件中。

–EOF–

starwind+starport做共享存储

starwind和starport是个不错的组合,如果你想在virtualbox或者vmware上测试基于共享存储的集群方案的话,可以
考虑采用iscsi的starwind服务端+starport客户端的组合。例如,mscs,oracle rac等等。。
之前有客户需求搭建模拟测试环境,在mscs+oracle做primary/standby模式测试,遇到死活找不到仲裁磁盘的错误。。
折腾了两个晚上,在朋友的建议拿starwind和starport来做共享存储,很轻松解决了,很方便。
–EOF–

esxi server to esxi server

客户最近有需求,要将一个环境中的esxi虚拟机迁移至另外一套esxi server中,因为有停机的时间,于是建议客户使用最笨的方法,利用vsphere client的工具,download和upload功能将vm的文件迁移至新的esxi server上,再浏览存储的时候,点击vmx配置文件,“添加到清单”中。方法确实很笨。。。囧。不过倒是解决了客户的需求。我想esxi应该有迁移的更好的、现成的方法,但肯定需要额外的license

create public database link vs create database link

create public database link与create database link的区别就是:public link是数据库中所有数据库用户都可以访问使用的,而普通的database link是私有的link,仅仅对创建它的用户可用。

用户可以通过dblink,和远端数据库的账户A,从本地去访问远端用户A下面的数据。
但从数据管控的角度来讲,存在相对于A用户的远端源头,能够操作A下面用户的风险